FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator


Geng Yuan*1, Payman Behnam*2, Zhengang Li1, Ali Shafiee3, Sheng Lin1, Xiaolong Ma1, Hang Liu4, Xuehai Qian5, Mahdi Nazm Bojnordi6, Yanzhi Wang1, Caiwen Ding7

1Northeastern University, 2Georgia Institute of Technology, 3Samsung, 4Stevens Institute of Technology, 5University of Southern California, 6University of Utah, 7University of Connecticut

1fyuan.geng, li.zhen, lin.sheng, ma.xiaol, [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

*These authors contributed equally.

arXiv:2106.09144v1 [cs.AR] 16 Jun 2021

Abstract: Recent work demonstrated the promise of using resistive random access memory (ReRAM) as an emerging technology to perform inherently parallel analog-domain in-situ matrix-vector multiplication, the intensive and key computation in deep neural networks (DNNs). One key problem is that DNN weights are signed values. However, in a ReRAM crossbar, weights are stored as the conductance of the crossbar cells, and the in-situ computation assumes all cells on each crossbar column are of the same sign. The current architectures either use two ReRAM crossbars for positive and negative weights (PRIME), or add an offset to weights so that all values become positive (ISAAC). Neither solution is ideal: they either double the cost of crossbars, or incur extra offset circuitry. To better address this problem, we propose FORMS, a fine-grained ReRAM-based DNN accelerator with algorithm/hardware co-design. Instead of trying to represent the positive/negative weights, our key design principle is to enforce exactly what is assumed in the in-situ computation: ensuring that all weights in the same column of a crossbar have the same sign. It naturally avoids the cost of an additional crossbar. Such polarized weights can be generated using alternating direction method of multipliers (ADMM) regularized optimization during DNN training, which can exactly enforce certain patterns in DNN weights. To achieve high accuracy, we divide the crossbar into logical sub-arrays and only enforce this property within the fine-grained sub-array columns. Crucially, the small sub-arrays provide a unique opportunity for input zero-skipping, which can avoid a significant amount of unnecessary computation and reduce computation time. At the same time, it also makes the hardware much easier to implement and less susceptible to non-idealities and noise than coarse-grained architectures. Putting all together, with the same optimized DNN models, FORMS achieves 1.50× and 1.93× throughput improvement in terms of GOPs/(s×mm^2) and GOPs/W compared to ISAAC, and 1.12× to 2.4× speedup in terms of frames per second over optimized ISAAC with almost the same power/area cost. Interestingly, the FORMS optimization framework can even speed up the original ISAAC from 10.7× up to 377.9×, reflecting the importance of software/hardware co-design optimizations.

I. INTRODUCTION

Deep Neural Networks (DNNs) have become the fundamental element and core enabler of ubiquitous artificial intelligence, thanks to their high accuracy, excellent scalability, and self-adaptiveness [1]. With the ever-growing DNN model size, the high computation and memory storage of DNN models introduce substantial data movements, posing key challenges to the conventional Von Neumann architectures, where weight storage and computation units are separated. To reduce data movement, model compression techniques [2–4] and hardware accelerators [5–11] have been intensively investigated. However, as Moore's law is reaching an end [12], the potential of the acceleration architecture based on conventional technology is still limited. We argue that drastic improvements can only be achieved by 1) the next-generation emerging device/circuit technology beyond CMOS; and 2) the vertical integration [13] and optimization of algorithm, architecture, and technology innovations to deliver better overall performance and energy efficiency for various applications.

A promising emerging technology is the recently discovered resistive random access memory (ReRAM) [14, 15] devices that are able to perform the inherently parallel in-situ matrix-vector multiplication in the analog domain. This key feature has been applied to several significant problems, including solving systems of linear equations in O(1) time complexity [16], and more interestingly, building DNN accelerators [17–24]. Since the key computation in DNNs can be essentially expressed as matrix-vector multiplication, ReRAM crossbars can naturally accelerate DNNs with much less data movement and low-cost computation. Due to their promising results, structured pruning [3] and quantization [4] are also developed to facilitate a smaller number of weights and bits to reduce the amount of computation and ReRAM hardware resources.

On the other side, a key complication of the ReRAM-based DNN accelerator is that although the weights stored in ReRAM crossbar cells can be either positive or negative, the in-situ computation assumes all cells in each crossbar column hold values of the same sign, i.e., all positive or all negative. There are two approaches to tackle the problem. The general way is to use two ReRAM crossbars to hold the positive and negative weight magnitudes separately, doubling the ReRAM portion of hardware cost [17, 25–28]. In contrast, ISAAC [18] adds an offset to weights so that all values become positive. While keeping the number of crossbars the same, the latter approach introduces additional hardware costs for the peripheral circuits by adding extra offset circuits and may also decrease the network robustness to hardware failures [29]. We argue that both solutions are not ideal, and we attempt to develop an alternative approach with better cost/performance trade-offs.

[Fig. 1: Overall Flow of FORMS Algorithm/Hardware Co-design. Diagram not recoverable from the extracted text. Panel titles: "FORMS Optimization Framework" and "FORMS Accelerator Architecture Design". Recoverable labels: pre-trained DNN model, weight pruning, fragment polarization, ReRAM-aware quantization, training with ADMM regularized optimization, optimized DNN model; tile, MCU, driver, DAC, zero-skip, S&H, ADCs, sign indicator, data-flow controller.]

Different from the previous approaches, which use additional hardware to "fix" the problem, our design principle is to enforce exactly what is assumed in the in-situ computation: ensuring the pattern that all weights in the same column of a crossbar have the same sign. This idea takes the opportunity of algorithm and hardware co-design, and is motivated by the capability of the powerful alternating direction method of multipliers (ADMM) regularized optimization [30], which is exactly able to enforce patterns in DNN training while maintaining high accuracy. Based on this idea, we can train a DNN model using ADMM with our novel constraints such that the weights mapped to the same crossbar columns are all positive or all negative. With the typical ReRAM crossbar size, e.g., 128 × 128, we found that the "same-sign" property at coarse granularity, i.e., the whole column of 128 weights, can lead to accuracy degradation. To maintain high accuracy, we propose to divide the crossbar into smaller logical sub-arrays and only enforce the property within the fine-grained sub-array columns, and use fine-grained computation instead of coarse-grained computation (i.e., calculate one fine-grained sub-array column at a time). However, this raises another problem: the mainstream designs take advantage of coarse-grained computation to achieve high performance (i.e., frames per second (FPS)). For a fine-grained design to reach a similar FPS to that of coarse-grained designs, if no other optimizations are applied, more analog-to-digital converters (ADCs) are needed to operate in parallel to compensate for the performance degradation caused by computation granularity, which will generally lead to a higher hardware overhead than the coarse-grained designs.

Is this yet another idea that seems to work at first thought but fails after deliberation? Fortunately, [...] fine-grained computation are easier to implement than large ADCs for coarse-grained computation, and make the design less susceptible to non-idealities and noise [31, 32].

Putting all together, this paper proposes FORMS, the first algorithm/hardware co-designed fine-grained ReRAM architecture solution leveraging polarized weights. The overall flow of FORMS algorithm/hardware co-design is shown in Figure 1. Starting with a pretrained model, we first apply structured pruning. Then, the weight matrix in the pruned model is divided into fixed-sized fragments, and each fragment corresponds to a column of the crossbar sub-array. By incorporating our fragment polarization constraints in ADMM regularized training, the weights in each fragment will be trained to have the same sign. Please note that we are not moving the weights around to form polarized fragments. Finally, quantization is applied to reduce the number of bits required for each weight. With the multi-step ADMM-based structured pruning, polarization, and quantization, a significant model compression ratio is achieved.

At the hardware level, we design a fine-grained DNN accelerator architecture leveraging fine-grained computations. A novel zero-skipping logic is developed to control the shift of input bit streams on-the-fly and avoid entering the unnecessary zero bits into each layer of the ReRAM-based DNN, eliminating useless computations and starting computation on the next input early. Applied to the inputs of a small sub-array (fragment), zero-skipping significantly reduces the frame processing time and energy consumption.

II. BACKGROUND AND CHALLENGES

A. ReRAM Crossbar and ReRAM-Based DNN Acceleration

Recently, there has been significant progress in fabricating non-volatile memories.
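The fragment polarization constraint described in the introduction can be illustrated with a small projection step of the kind an ADMM-style training loop alternates with gradient updates. The sketch below is only an illustration under stated assumptions: the text does not say how the sign of each fragment is chosen, so the code assumes a Euclidean projection (keep the sign whose entries carry more squared magnitude and zero the rest). The names `polarize_fragment` and `polarize_column` are hypothetical, and the surrounding ADMM training is not shown.

```python
from typing import List


def polarize_fragment(fragment: List[float]) -> List[float]:
    """Project one fragment onto the set of same-sign vectors.

    Assumption (not from the paper): use the Euclidean projection, i.e.
    keep whichever sign loses less squared magnitude when the other
    sign is zeroed out.
    """
    pos_energy = sum(w * w for w in fragment if w > 0)
    neg_energy = sum(w * w for w in fragment if w < 0)
    if pos_energy >= neg_energy:
        return [w if w > 0 else 0.0 for w in fragment]
    return [w if w < 0 else 0.0 for w in fragment]


def polarize_column(column: List[float], fragment_size: int) -> List[float]:
    """Split a weight column into fixed-size fragments and polarize each.

    Weights stay in place; only minority-sign entries are zeroed.
    """
    out: List[float] = []
    for start in range(0, len(column), fragment_size):
        out.extend(polarize_fragment(column[start:start + fragment_size]))
    return out


if __name__ == "__main__":
    column = [0.5, 0.4, -0.1, 0.3, -0.6, -0.7, 0.2, -0.5]
    print(polarize_column(column, 4))  # fine-grained: two fragments of 4
    print(polarize_column(column, 8))  # coarse-grained: whole column
```

On this toy column, fragments of 4 zero out two weights, while treating the whole column as one fragment zeroes out four. This matches the intuition given above for why coarse-grained same-sign enforcement can hurt accuracy more than fine-grained enforcement.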
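The benefit of applying zero-skipping to small fragments, as described in the introduction, can be sketched with a simple cycle count. The model below is an assumption made for illustration, not the paper's circuit: inputs are unsigned integers streamed one bit per cycle to all rows of a fragment, and only the leading zero bits shared by every input in the fragment are skipped. The function names are hypothetical.

```python
from typing import List


def cycles_for_fragment(inputs: List[int], bits: int) -> int:
    """Bit-serial cycles for one fragment when shared leading zeros are skipped.

    Assumed model: the fragment needs as many cycles as the longest
    effective bit length among its inputs.
    """
    for x in inputs:
        if x < 0 or x >= (1 << bits):
            raise ValueError(f"input {x} does not fit in {bits} unsigned bits")
    return max((x.bit_length() for x in inputs), default=0)


def per_fragment_cycles(inputs: List[int], fragment_size: int, bits: int) -> List[int]:
    """Cycle count for each consecutive fragment of the input vector."""
    return [
        cycles_for_fragment(inputs[i:i + fragment_size], bits)
        for i in range(0, len(inputs), fragment_size)
    ]


if __name__ == "__main__":
    activations = [3, 0, 1, 2, 200, 5, 0, 0]  # 8-bit inputs
    for size in (8, 4, 2):
        print(size, per_fragment_cycles(activations, size, bits=8))
```

With a single fragment of 8, the one large input (200) forces all 8 cycles for every row. With fragments of 4 or 2, only the fragment that contains 200 pays the full 8 cycles, and the others finish in 2 cycles or fewer. Under this assumed model, that is why finer granularity leaves more room for skipping.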