Chapter 5 - Internal Memory
Total Page:16
File Type:pdf, Size:1020Kb
Load more
										Recommended publications
									
								- 
												  Solutions to Chapter 3Solutions to Chapter 3 1. Suppose the size of an uncompressed text file is 1 megabyte. Solutions follow questions: [4 marks – 1 mark each for a & b, 2 marks c] a. How long does it take to download the file over a 32 kilobit/second modem? T32k = 8 (1024) (1024) / 32000 = 262.144 seconds b. How long does it take to take to download the file over a 1 megabit/second modem? 6 T1M = 8 (1024) (1024) bits / 1x10 bits/sec = 8.38 seconds c. Suppose data compression is applied to the text file. How much do the transmission times in parts (a) and (b) change? If we assume a maximum compression ratio of 1:6, then we have the following times for the 32 kilobit and 1 megabit lines respectively: T32k = 8 (1024) (1024) / (32000 x 6) = 43.69 sec 6 T1M = 8 (1024) (1024) / (1x10 x 6) = 1.4 sec 2. A scanner has a resolution of 600 x 600 pixels/square inch. How many bits are produced by an 8-inch x 10-inch image if scanning uses 8 bits/pixel? 24 bits/pixel? [3 marks – 1 mark for pixels per picture, 1 marks each representation] Solution: The number of pixels is 600x600x8x10 = 28.8x106 pixels per picture. With 8 bits/pixel representation, we have: 28.8x106 x 8 = 230.4 Mbits per picture. With 24 bits/pixel representation, we have: 28.8x106 x 24 = 691.2 Mbits per picture. 6. Suppose a storage device has a capacity of 1 gigabyte. How many 1-minute songs can the device hold using conventional CD format? using MP3 coding? [4 marks – 2 marks each] Solution: A stereo CD signal has a bit rate of 1.4 megabits per second, or 84 megabits per minute, which is approximately 10 megabytes per minute.
- 
												  Memory & DevicesMemory & Devices Memory • Random Access Memory (vs. Serial Access Memory) • Different flavors at different levels – Physical Makeup (CMOS, DRAM) – Low Level Architectures (FPM,EDO,BEDO,SDRAM, DDR) • Cache uses SRAM: Static Random Access Memory – No refresh (6 transistors/bit vs. 1 transistor • Main Memory is DRAM: Dynamic Random Access Memory – Dynamic since needs to be refreshed periodically (1% time) – Addresses divided into 2 halves (Memory as a 2D matrix): • RAS or Row Access Strobe • CAS or Column Access Strobe Random-Access Memory (RAM) Key features – RAM is packaged as a chip. – Basic storage unit is a cell (one bit per cell). – Multiple RAM chips form a memory. Static RAM (SRAM) – Each cell stores bit with a six-transistor circuit. – Retains value indefinitely, as long as it is kept powered. – Relatively insensitive to disturbances such as electrical noise. – Faster and more expensive than DRAM. Dynamic RAM (DRAM) – Each cell stores bit with a capacitor and transistor. – Value must be refreshed every 10-100 ms. – Sensitive to disturbances. – Slower and cheaper than SRAM. Semiconductor Memory Types Static RAM • Bits stored in transistor “latches” à no capacitors! – no charge leak, no refresh needed • Pro: no refresh circuits, faster • Con: more complex construction, larger per bit more expensive transistors “switch” faster than capacitors charge ! • Cache Static RAM Structure 1 “NOT ” 1 0 six transistors per bit 1 0 (“flip flop”) 0 1 0/1 = example 0 Static RAM Operation • Transistor arrangement (flip flop) has 2 stable logic states • Write 1. signal bit line: High à 1 Low à 0 2. address line active à “switch” flip flop to stable state matching bit line • Read no need 1.
- 
												  Performance Impact of Memory Channels on Sparse and Irregular AlgorithmsPerformance Impact of Memory Channels on Sparse and Irregular Algorithms Oded Green1,2, James Fox2, Jeffrey Young2, Jun Shirako2, and David Bader3 1NVIDIA Corporation — 2Georgia Institute of Technology — 3New Jersey Institute of Technology Abstract— Graph processing is typically considered to be a Graph algorithms are typically latency-bound if there is memory-bound rather than compute-bound problem. One com- not enough parallelism to saturate the memory subsystem. mon line of thought is that more available memory bandwidth However, the growth in parallelism for shared-memory corresponds to better graph processing performance. However, in this work we demonstrate that the key factor in the utilization systems brings into question whether graph algorithms are of the memory system for graph algorithms is not necessarily still primarily latency-bound. Getting peak or near-peak the raw bandwidth or even the latency of memory requests. bandwidth of current memory subsystems requires highly Instead, we show that performance is proportional to the parallel applications as well as good spatial and temporal number of memory channels available to handle small data memory reuse. The lack of spatial locality in sparse and transfers with limited spatial locality. Using several widely used graph frameworks, including irregular algorithms means that prefetched cache lines have Gunrock (on the GPU) and GAPBS & Ligra (for CPUs), poor data reuse and that the effective bandwidth is fairly low. we evaluate key graph analytics kernels using two unique The introduction of high-bandwidth memories like HBM and memory hierarchies, DDR-based and HBM/MCDRAM. Our HMC have not yet closed this inefficiency gap for latency- results show that the differences in the peak bandwidths sensitive accesses [19].
- 
												  Digital Measurement UnitsDigital Measurement units Kilobit = 1 kbit = 1,000bits Kilobit per second (kbit/s or kb/s or kbps) is a unit of data transfer rate equal to 1,000 bits per second. It is sometimes mistakenly thought to mean 1,024 bits per second, using the binary meaning of the kilo- prefix, though this is incorrect. Examples • 56k modem — 56,000 bit/s • 128 kbit/s mp3 — 128,000 bit/s [1] • 64k ISDN — 64,000 bit/s [2] • 1536k T1 — 1,536,000 bit/s (1.536 Mbit/s) Most digital representations of audio are measured in kbit/s: (These values vary depending on audio data compression schemes) • 4 kbit/s – minimum necessary for recognizable speech (using special-purpose speech codecs) • 8 kbit/s – telephone quality • 32 kbit/s – MW quality • 96 kbit/s – FM quality • 192 kbit/s – Nearly CD quality for a file compressed in the MP3 format • 1,411 kbit/s – CD audio (at 16-bits for each channel and 44.1 kHz) Megabit (Mb)= 106 = 1,000,000 bits which is equal to 125,000 bytes or 125 kilobytes. Megabit per second (abbreviated as Mbps, Mbit/s, or mbps) is a unit of data transfer rates equal to 1,000,000 bits per second (this equals about 976 kilobits per second). Because there are 8 bits in a byte, a transfer speed of 8 megabits per second (8 Mbps) is equivalent to 1,000,000 bytes per second (approximately 976 KiB/s). Usage Examples: The bandwidth of consumer broadband internet services is often rated in Mbps.
- 
												  DDR and DDR2 SDRAM Controller Compiler User GuideDDR and DDR2 SDRAM Controller Compiler User Guide 101 Innovation Drive Software Version: 9.0 San Jose, CA 95134 Document Date: March 2009 www.altera.com Copyright © 2009 Altera Corporation. All rights reserved. Altera, The Programmable Solutions Company, the stylized Altera logo, specific device designations, and all other words and logos that are identified as trademarks and/or service marks are, unless noted otherwise, the trademarks and service marks of Altera Corporation in the U.S. and other countries. All other product or service names are the property of their respective holders. Altera products are protected under numerous U.S. and foreign patents and pending ap- plications, maskwork rights, and copyrights. Altera warrants performance of its semiconductor products to current specifications in accordance with Altera's standard warranty, but reserves the right to make changes to any products and services at any time without notice. Altera assumes no responsibility or liability arising out of the application or use of any information, product, or service described herein except as expressly agreed to in writing by Altera Corporation. Altera customers are advised to obtain the latest version of device specifications before relying on any published information and before placing orders for products or services. UG-DDRSDRAM-10.0 Contents Chapter 1. About This Compiler Release Information . 1–1 Device Family Support . 1–1 Features . 1–2 General Description . 1–2 Performance and Resource Utilization . 1–4 Installation and Licensing . 1–5 OpenCore Plus Evaluation . 1–6 Chapter 2. Getting Started Design Flow . 2–1 SOPC Builder Design Flow . 2–1 DDR & DDR2 SDRAM Controller Walkthrough .
- 
												  AXP Internal 2-Apr-20 12-Apr-20 AXP Internal 1 2-Apr-20 AXP Internal 2 2-Apr-20 AXP Internal 3 2-Apr-20 AXP Internal 4 2-Apr-20 AXP Internal 5 2-Apr-20 AXP Internal 6 Class 6 Subject: Computer Science Title of the Book: IT Planet Petabyte Chapter 2: Computer Memory GENERAL INSTRUCTIONS: • Exercises to be written in the book. • Assignment questions to be done in ruled sheets. • You Tube link is for the explanation of Primary and Secondary Memory. YouTube Link: ➢ https://youtu.be/aOgvgHiazQA INTRODUCTION: ➢ Computer can store a large amount of data safely in their memory for future use. ➢ A computer’s memory is measured either in Bits or Bytes. ➢ The memory of a computer is divided into two categories: Primary Memory, Secondary Memory. ➢ There are two types of Primary Memory: ROM and RAM. ➢ Cache Memory is used to store program and instructions that are frequently used. EXPLANATION: Computer Memory: Memory plays a very important role in a computer. It is the basic unit where data and instructions are stored temporarily. Memory usually consists of one or more chips on the mother board, or you can say it consists of electronic components that store instructions waiting to be executed by the processor, data needed by those instructions, and the results of processing the data. Memory Units: Computer memory is measured in bits and bytes. A bit is the smallest unit of information that a computer can process and store. A group of 4 bits is known as nibble, and a group of 8 bits is called byte.
- 
												  How Many Bits Are in a Byte in Computer TermsHow Many Bits Are In A Byte In Computer Terms Periosteal and aluminum Dario memorizes her pigeonhole collieshangie count and nagging seductively. measurably.Auriculated and Pyromaniacal ferrous Gunter Jessie addict intersperse her glockenspiels nutritiously. glimpse rough-dries and outreddens Featured or two nibbles, gigabytes and videos, are the terms bits are in many byte computer, browse to gain comfort with a kilobyte est une unité de armazenamento de armazenamento de almacenamiento de dados digitais. Large denominations of computer memory are composed of bits, Terabyte, then a larger amount of nightmare can be accessed using an address of had given size at sensible cost of added complexity to access individual characters. The binary arithmetic with two sets render everything into one digit, in many bits are a byte computer, not used in detail. Supercomputers are its back and are in foreign languages are brainwashed into plain text. Understanding the Difference Between Bits and Bytes Lifewire. RAM, any sixteen distinct values can be represented with a nibble, I already love a Papst fan since my hybrid head amp. So in ham of transmitting or storing bits and bytes it takes times as much. Bytes and bits are the starting point hospital the computer world Find arrogant about the Base-2 and bit bytes the ASCII character set byte prefixes and binary math. Its size can vary depending on spark machine itself the computing language In most contexts a byte is futile to bits or 1 octet In 1956 this leaf was named by. Pages Bytes and Other Units of Measure Robelle. This function is used in conversion forms where we are one series two inputs.
- 
												  Solving High-Speed Memory Interface Challenges with Low-Cost FpgasSOLVING HIGH-SPEED MEMORY INTERFACE CHALLENGES WITH LOW-COST FPGAS A Lattice Semiconductor White Paper May 2005 Lattice Semiconductor 5555 Northeast Moore Ct. Hillsboro, Oregon 97124 USA Telephone: (503) 268-8000 www.latticesemi.com Introduction Memory devices are ubiquitous in today’s communications systems. As system bandwidths continue to increase into the multi-gigabit range, memory technologies have been optimized for higher density and performance. In turn, memory interfaces for these new technologies pose stiff challenges for designers. Traditionally, memory controllers were embedded in processors or as ASIC macrocells in SoCs. With shorter time-to-market requirements, designers are turning to programmable logic devices such as FPGAs to manage memory interfaces. Until recently, only a few FPGAs supported the building blocks to interface reliably to high-speed, next generation devices, and typically these FPGAs were high-end, expensive devices. However, a new generation of low-cost FPGAs has emerged, providing the building blocks, high-speed FPGA fabric, clock management resources and the I/O structures needed to implement next generation DDR2, QDR2 and RLDRAM memory controllers. Memory Applications Memory devices are an integral part of a variety of systems. However, different applications have different memory requirements. For networking infrastructure applications, the memory devices required are typically high-density, high-performance, high-bandwidth memory devices with a high degree of reliability. In wireless applications, low-power memory is important, especially for handset and mobile devices, while high-performance is important for base-station applications. Broadband access applications typically require memory devices in which there is a fine balance between cost and performance.
- 
												  Semiconductor MemoriesSemiconductor Memories Prof. MacDonald Types of Memories! l" Volatile Memories –" require power supply to retain information –" dynamic memories l" use charge to store information and require refreshing –" static memories l" use feedback (latch) to store information – no refresh required l" Non-Volatile Memories –" ROM (Mask) –" EEPROM –" FLASH – NAND or NOR –" MRAM Memory Hierarchy! 100pS RF 100’s of bytes L1 1nS SRAM 10’s of Kbytes 10nS L2 100’s of Kbytes SRAM L3 100’s of 100nS DRAM Mbytes 1us Disks / Flash Gbytes Memory Hierarchy! l" Large memories are slow l" Fast memories are small l" Memory hierarchy gives us illusion of large memory space with speed of small memory. –" temporal locality –" spatial locality Register Files ! l" Fastest and most robust memory array l" Largest bit cell size l" Basically an array of large latches l" No sense amps – bits provide full rail data out l" Often multi-ported (i.e. 8 read ports, 2 write ports) l" Often used with ALUs in the CPU as source/destination l" Typically less than 10,000 bits –" 32 32-bit fixed point registers –" 32 60-bit floating point registers SRAM! l" Same process as logic so often combined on one die l" Smaller bit cell than register file – more dense but slower l" Uses sense amp to detect small bit cell output l" Fastest for reads and writes after register file l" Large per bit area costs –" six transistors (single port), eight transistors (dual port) l" L1 and L2 Cache on CPU is always SRAM l" On-chip Buffers – (Ethernet buffer, LCD buffer) l" Typical sizes 16k by 32 Static Memory
- 
												  Case Study on Integrated Architecture for In-Memory and In-Storage Computingelectronics Article Case Study on Integrated Architecture for In-Memory and In-Storage Computing Manho Kim 1, Sung-Ho Kim 1, Hyuk-Jae Lee 1 and Chae-Eun Rhee 2,* 1 Department of Electrical Engineering, Seoul National University, Seoul 08826, Korea; [email protected] (M.K.); [email protected] (S.-H.K.); [email protected] (H.-J.L.) 2 Department of Information and Communication Engineering, Inha University, Incheon 22212, Korea * Correspondence: [email protected]; Tel.: +82-32-860-7429 Abstract: Since the advent of computers, computing performance has been steadily increasing. Moreover, recent technologies are mostly based on massive data, and the development of artificial intelligence is accelerating it. Accordingly, various studies are being conducted to increase the performance and computing and data access, together reducing energy consumption. In-memory computing (IMC) and in-storage computing (ISC) are currently the most actively studied architectures to deal with the challenges of recent technologies. Since IMC performs operations in memory, there is a chance to overcome the memory bandwidth limit. ISC can reduce energy by using a low power processor inside storage without an expensive IO interface. To integrate the host CPU, IMC and ISC harmoniously, appropriate workload allocation that reflects the characteristics of the target application is required. In this paper, the energy and processing speed are evaluated according to the workload allocation and system conditions. The proof-of-concept prototyping system is implemented for the integrated architecture. The simulation results show that IMC improves the performance by 4.4 times and reduces total energy by 4.6 times over the baseline host CPU.
- 
												  Computer ArchitecturesComputer Architectures Processor+Memory Richard Šusta, Pavel Píša Czech Technical University in Prague, Faculty of Electrical Engineering AE0B36APO Computer Architectures Ver.1.00 2019 1 The main instruction cycle of the CPU 1. Initial setup/reset – set initial PC value, PSW, etc. 2. Read the instruction from the memory PC → to the address bus Read the memory contents (machine instruction) and transfer it to the IR PC+l → PC, where l is length of the instruction 3. Decode operation code (opcode) 4. Execute the operation compute effective address, select registers, read operands, pass them through ALU and store result 5. Check for exceptions/interrupts (and service them) 6. Repeat from the step 2 AE0B36APO Computer Architectures 2 Single cycle CPU – implementation of the load instruction lw: type I, rs – base address, imm – offset, rt – register where to store fetched data I opcode(6), 31:26 rs(5), 25:21 rt(5), 20:16 immediate (16), 15:0 ALUControl 25:21 WE3 SrcA Zero WE PC’ PC Instr A1 RD1 A RD ALU A RD Instr. A2 RD2 SrcB AluOut Data ReadData Memory Memory A3 Reg. WD3 File WD 15:0 Sign Ext SignImm B35APO Computer Architectures 3 Single cycle CPU – implementation of the load instruction lw: type I, rs – base address, imm – offset, rt – register where to store fetched data I opcode(6), 31:26 rs(5), 25:21 rt(5), 20:16 immediate (16), 15:0 Write on rising edge of CLK RegWrite = 1 ALUControl 25:21 WE3 SrcA Zero WE PC’ PC Instr A1 RD1 A RD ALU A RD Instr.
- 
												  ECE 571 – Advanced Microprocessor-Based Design Lecture 17ECE 571 { Advanced Microprocessor-Based Design Lecture 17 Vince Weaver http://web.eece.maine.edu/~vweaver [email protected] 3 April 2018 Announcements • HW8 is readings 1 More DRAM 2 ECC Memory • There's debate about how many errors can happen, anywhere from 10−10 error/bit*h (roughly one bit error per hour per gigabyte of memory) to 10−17 error/bit*h (roughly one bit error per millennium per gigabyte of memory • Google did a study and they found more toward the high end • Would you notice if you had a bit flipped? • Scrubbing { only notice a flip once you read out a value 3 Registered Memory • Registered vs Unregistered • Registered has a buffer on board. More expensive but can have more DIMMs on a channel • Registered may be slower (if it buffers for a cycle) • RDIMM/UDIMM 4 Bandwidth/Latency Issues • Truly random access? No, burst speed fast, random speed not. • Is that a problem? Mostly filling cache lines? 5 Memory Controller • Can we have full random access to memory? Why not just pass on CPU mem requests unchanged? • What might have higher priority? • Why might re-ordering the accesses help performance (back and forth between two pages) 6 Reducing Refresh • DRAM Refresh Mechanisms, Penalties, and Trade-Offs by Bhati et al. • Refresh hurts performance: ◦ Memory controller stalls access to memory being refreshed ◦ Refresh takes energy (read/write) On 32Gb device, up to 20% of energy consumption and 30% of performance 7 Async vs Sync Refresh • Traditional refresh rates ◦ Async Standard (15.6us) ◦ Async Extended (125us) ◦ SDRAM -