Chapter 1: Data Storage
1.1 Bits and Their Storage 1.2 Main Memory 1.3 Mass Storage 1.4 Representing Information as Bit Patterns 1.5 The Binary System 1.6 Storing Integers 1.7 Storing Fractions 1.8 Data Compression 1.9 Communication Errors
1 Computer == 0 and 1 ?
Hmm…every movie says so….
Chapter 2 2 Why can't I see any 0s and 1s when I open up a computer?
3 What's really in the computer, with the help of an oscilloscope
4 Bits (figure: oscilloscope trace showing HIGH and LOW voltage levels)
Bit Binary Digit The basic unit of information in a computer A symbol whose meaning depends on the application at hand. Numeric value (1 or 0) Boolean value (true or false) Voltage (high or low)
For TTL, high=5V, low=0V
5 More bits?
Prefix Decimal Binary
kilobit (kbit) 10^3 2^10
megabit (Mbit) 10^6 2^20
gigabit (Gbit) 10^9 2^30
terabit (Tbit) 10^12 2^40
petabit (Pbit) 10^15 2^50
exabit (Ebit) 10^18 2^60
zettabit (Zbit) 10^21 2^70
6 Bit Patterns
All data stored in a computer are represented by patterns of bits: Numbers Text characters Images Sound Anything else…
7 Now you know what’s a ‘bit’, and then?
How to ‘compute’ bits? How to ‘implement’ it ? How to ‘store’ bit in a computer?
8 How to do computation on Bits?
9 Boolean Operations
Boolean operation: any operation that manipulates one or more true/false values (e.g. 1 = true, 0 = false) Can be used to operate on bits Specific operations: AND OR XOR NOT
10 AND, OR, XOR
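The four operations can be tried directly in Python (a sketch; the bitwise operators `&`, `|`, `^` are this example's choice, not something the slides prescribe):

```python
# Boolean operations on single bits (using the convention 1 = true, 0 = false)
a, b = 1, 0

print(a & b)  # AND: 1 only when both inputs are 1 -> 0
print(a | b)  # OR:  1 when at least one input is 1 -> 1
print(a ^ b)  # XOR: 1 when exactly one input is 1  -> 1
print(1 - a)  # NOT: inverts a single bit           -> 0
```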
11 Implement the Boolean operation on a computer
12 Gates
Gates devices that produce the outputs of Boolean operations when given the operations’ input values Often implemented as electronic circuits Provide the building blocks from which computers are constructed
13 AND, OR, XOR, NOT Gates
14 NAND, NOR, XNOR
INPUT OUTPUT
A B | A NAND B
0 0 | 1
0 1 | 1
1 0 | 1
1 1 | 0
(NOR = NOT(A OR B) and XNOR = NOT(A XOR B) have analogous truth tables)
15 Implement a gate
You can implement it with transistors But today gates are normally packaged in Integrated Circuit (IC) forms
17 How many gates in this picture?
NOT=?
NAND=?
NOR=?
18 Classification of IC
Complexity measured by number of gates:
Small-Scale Integration (SSI) < 12
Medium-Scale Integration (MSI) 12-99
Large-Scale Integration (LSI) 100-9,999
Very Large-Scale Integration (VLSI) 10,000-99,999
Ultra Large-Scale Integration (ULSI) > 100,000
19 How to ‘store’ the bit?
20 Store the bit
What do we mean by ‘storing’ the data?
Bits in the computer at the i-th second = bits in the computer at the (i+1)-th second
For example, if the current state of a bit is ‘1’, it should still be ‘1’ one second later
21 Using Flip-flop to store the bit
22 Flip-flops?
NOT THIS!! 23 Flip-Flops
Flip-Flop a circuit built from gates that can store one bit of data Two input lines One input line sets its stored value to 1 Another input line sets its stored value to 0 When both input lines are 0, the most recently stored value is preserved
24 Flip-Flops Illustrated
Inputs X, Y; output Z
X Y | Z
0 0 | last stored value
0 1 | 0
1 0 | 1
1 1 | not used
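The table's behavior can be sketched as a small simulation (behavioral only, not gate-level; the function names are made up for this illustration):

```python
def make_flip_flop():
    """One-bit storage cell matching the flip-flop truth table:
    X=1 sets the stored bit, Y=1 resets it, X=Y=0 preserves
    the most recently stored value, and X=Y=1 is not used."""
    state = {"q": 0}

    def step(x, y):
        assert not (x == 1 and y == 1), "input combination 1,1 is not used"
        if x == 1:
            state["q"] = 1
        elif y == 1:
            state["q"] = 0
        return state["q"]

    return step

ff = make_flip_flop()
print(ff(1, 0))  # set   -> 1
print(ff(0, 0))  # hold  -> 1
print(ff(0, 1))  # reset -> 0
print(ff(0, 0))  # hold  -> 0
```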
25 A Simple Flip-Flop Circuit
X
Y
26 Setting to 1 – Step a.
27 Setting to 1 – Step b.
28 Setting to 1 – Step c.
29 Try This
Show how the output can be turned to 0
30 Another Flip-Flop Circuit
X
Y
Try this: When X=0, Y=1 What would be the output?
31 Flip-Flops
A kind of storage Holds its value until the power is turned off
32 Types of Memory
Volatile memory (e.g. dynamic memory) Holds its value only while the power is on Non-volatile memory Holds its value even after the power is off
33 Capacitors
Two metallic plates positioned in parallel Principle: Tiny space between the two plates Apply a voltage to charge the plates Remove the voltage and the charge remains on the plates Connect the two plates to discharge them Many of these capacitors, with their circuits, fit on a wafer (chip) Even more volatile than flip-flops Must be replenished (refreshed) periodically
34 Flash Memory
Originally made from EEPROM (Electrically Erasable Programmable Read-Only Memory) Works by trapping electrons in silicon dioxide
SiO2 chambers/cell Each chamber is very small: 90 angstroms 1 angstrom is 1/10,000,000,000 of a meter Non-volatile
35 History of Flash Memory
PROM->EPROM->EEPROM->Flash memory
36 Comparison
EPROM (Erasable Programmable Read-Only Memory) Uses UV light to erase the cells/bits Erases the whole chip at a time EEPROM (Electrically Erasable Programmable Read-Only Memory) Uses an electric field Erases one bit at a time Flash memory Erases one chunk/block of bits at a time
37 Types of Memory
Volatile memory (e.g. dynamic memory) Holds its value only while the power is on Examples: flip-flops, capacitors Non-volatile memory Holds its value even after the power is off Examples: flash memory, magnetic storage, compact disks, etc.
38 Hexadecimal Notation
Why? Data in a computer comes as bit streams (a stream = a long string of bits), and bit streams get long
E.g., 1000 = 1111101000₂
Hexadecimal (base-16) notation A shorthand notation for streams of bits. More compact.
1000 = 3E8₁₆
39 The Hexadecimal System
1000 = 1111101000₂ = 3E8₁₆
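Python can check this conversion for us (a quick illustration; `bin`, `hex`, and `int` are standard built-ins):

```python
n = 1000
print(bin(n))          # 0b1111101000  (binary)
print(hex(n))          # 0x3e8         (hexadecimal)
print(int("3E8", 16))  # 1000          (back to decimal)

# Each hexadecimal digit stands for exactly four bits,
# which is why the notation is such a compact shorthand:
print(f"{n:b} -> {n:X}")  # 1111101000 -> 3E8
```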
40 Chapter 1: Data Storage
1.1 Bits and Their Storage 1.2 Main Memory 1.3 Mass Storage 1.4 Representing Information as Bit Patterns 1.5 The Binary System 1.6 Storing Integers 1.7 Storing Fractions 1.8 Data Compression 1.9 Communications Errors
41 Storage Hierarchy
From top to bottom, speed and cost decrease while capacity grows:
Cache
Main Memory
On-line Storage (e.g. Hard Disk)
Near-line Storage (e.g. Optical Disk)
Off-line Storage (e.g. Magnetic Tape)
42 CPU Cache
One type of memory located on the CPU Has fast access time Stores copies of the data from frequently used main memory locations As long as most memory accesses hit the cache, the average latency of memory accesses will be close to the cache latency
43 Main Memory: Basic
Cell Manageable memory unit Typically 8 bits Byte A string of 8 bits. High-order end The left end Low-order end The right end Least significant bit The last bit at the low-order end 44 A Byte-Size Memory Cell
45 Main Memory: Addresses
Address A “name” to uniquely identify one cell Consecutive numbers Usually starting at zero I.e., having an order “previous cell” and “next cell” have reasonable meanings Random Access Memory Memory where any cell can be accessed independently
46 Memory Cells
47 Not Quite the Metric System
K “Kilo-” normally means 1,000 Kilobyte = 2^10 = 1,024 M “Mega-” normally means 1,000,000 Megabyte = 2^20 = 1,048,576 G “Giga-” normally means 1,000,000,000 Gigabyte = 2^30 = 1,073,741,824
48 Mass Storage Systems
Non-volatile: data remains when the computer is off Usually much bigger than main memory Traditionally rotating disks: hard disk, floppy disk, CD-ROM Much slower than main memory (mechanical vs. electrical)
Data access must wait for seek time (head positioning)
Data access must wait for rotational latency
49 Disk arm
read/write head
50 Hard Disk
sector
track
51 Solid-State Disk (SSD) No mechanical parts Based on Flash memory technology Pros: Quiet, low power, immune to vibration Cons: More expensive (smaller capacity) Shorter lifetime (limited number of write cycles) Can't recover data from a damaged disk
52 Compact Disk
53 Multi-layered CD
pits
55 How about CD-R (writable CD)?
Doesn't have bumps on the surface An organic dye layer covers the CD Two lasers: The write laser heats up the dye layer enough to make it opaque The read laser senses the difference between clear dye and opaque dye the same way it senses bumps: it picks up the difference in reflectivity
56 CD-RW
Uses a special dye material that can change its transparency depending on the temperature
57 Magnetic Tape
58 Files
File The unit of data stored on a mass storage system. Logical record and Field Natural groups of data within a file Physical record A block of data conforming to the physical characteristics of the storage device. E.g. disk sector
59 Logical vs. Physical Records
phone num address name
60 Two or more logical records might be stored in the same physical record
One logical record might be split across two physical records
61 Buffers
Storage area used to hold data on a temporary basis Usually during the process of being transferred from one device to another Example: a printer with memory, holding portions of a document that have been transferred to the printer but not yet printed Why do we need buffers?
62 Why use a buffer?
Device sharing E.g. 3 computers sharing the same printer Connect devices of different speeds E.g. Computer -> buffer -> printer
63 Chapter 1: Data Storage
1.1 Bits and Their Storage 1.2 Main Memory 1.3 Mass Storage 1.4 Representing Information as Bit Patterns 1.5 The Binary System 1.6 Storing Integers 1.7 Storing Fractions 1.8 Data Compression 1.9 Communications Errors
64 Bit Patterns
All data stored in a computer are represented by patterns of bits: Text characters: “I love computer” Numbers: 123456 Images: JPEG, GIF, etc Sound:
65 How do we interpret “0110101”?
We use ‘coding’ to define the meaning of binary bits Human language is one kind of coding e.g. “gut” = gastrointestinal tract (English) = good (German) Many different coding schemes: ASCII EBCDIC ISO/IEC etc. 66 “CODE” = translation rule
What does this code say? 67 ASCII - “Hello.”
68 Representing Text
Printable character Letter, punctuation, etc. Assigned a unique bit pattern. ASCII 7-bit values For most symbols used in written English text Unicode 16-bit values For most symbols used in most world languages ISO 32-bit values
69 Representing Number
Binary notation Uses bits to represent a number in base two In practice Not all numbers can be represented
E.g. 0.1666666666666666… How many digits are kept depends on the capability of your computer (CPU and OS)
How many bits are used in MAC OS?
70 Limitations
Overflow Happens when a number is too big to be represented
Truncation Happens when a number falls between two representable numbers Think of the real numbers…
I will say more about this in a little bit
71 Representing Images
Bitmap techniques Points, pixels e.g. MS Paint (小畫家) Representing colors BMP, DIB, GIF and JPEG formats Scanners, digital cameras, computer displays Vector techniques Lines, scalable fonts, e.g., TrueType e.g. Acrobat Reader PostScript CAD systems, printers
72 Vector vs. bitmap
73 Representing Sound
74 Chapter 1: Data Storage
1.1 Bits and Their Storage 1.2 Main Memory 1.3 Mass Storage 1.4 Representing Information as Bit Patterns 1.5 The Binary System 1.6 Storing Integers 1.7 Storing Fractions 1.8 Data Compression 1.9 Communications Errors
75 Decimal vs. Binary Systems
10^2 10^1 10^0
2^3 2^2 2^1 2^0
76 From Binary to Decimal
77 From Decimal to Binary
78 An Example
13 = A×2^3 + B×2^2 + C×2 + D, so A=1, B=1, C=0, D=1 and 13 = 1101₂
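Both directions of the conversion can be sketched in a few lines (the algorithm is the positional method above; the variable names are this example's own):

```python
# Binary to decimal: each digit is weighted by a power of 2
bits = "1101"
value = sum(int(d) * 2**i for i, d in enumerate(reversed(bits)))
print(value)  # 13

# Decimal to binary: repeated division by 2, remainders read in reverse
n, digits = 13, []
while n > 0:
    digits.append(str(n % 2))
    n //= 2
print("".join(reversed(digits)))  # 1101
```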
Polly Huang, NTU EE Chapter 2 79 Adding Binary Numbers
  00111010
+ 00011011
----------
  01010101
80 Fraction in Binary System
2^2 2^1 2^0 2^-1 2^-2 2^-3
81 Adding Fractions
  00010.011
+ 00100.110
-----------
  00111.001
82 Chapter 1: Data Storage
1.1 Bits and Their Storage 1.2 Main Memory 1.3 Mass Storage 1.4 Representing Information as Bit Patterns 1.5 The Binary System 1.6 Storing Integers 1.7 Storing Fractions 1.8 Data Compression 1.9 Communications Errors
83 Two Kinds of Integers
Unsigned integers Numbers that are never negative Signed integers Numbers that can be positive or negative
84 Representing Signed Integers
Two's complement notation The most popular representation Excess notation A less popular representation These are only examples; different computers/OSs may represent signed integers in slightly different ways
85 Two’s Complement Notation
86 Coding –6 using Four Bits
87 Adding Numbers
88 Try This
In two's complement: 5 = 0101, 4 = ?; what is their sum?
89 Overflow
In two's complement: 5 = 0101 and 4 = 0100, but 0101 + 0100 = 1001, which reads back as -7 in decimal: overflow!
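The overflow above can be reproduced in a short sketch (the 4-bit width and helper names are assumptions of this illustration):

```python
BITS = 4

def to_twos_complement(n, bits=BITS):
    # Keep only the low `bits` bits; negative numbers wrap around
    return format(n & ((1 << bits) - 1), f"0{bits}b")

def from_twos_complement(pattern):
    v = int(pattern, 2)
    # A leading 1 marks a negative value
    return v - (1 << len(pattern)) if pattern[0] == "1" else v

print(to_twos_complement(5))       # 0101
print(to_twos_complement(4))       # 0100
s = to_twos_complement(5 + 4)      # the sum no longer fits in 4 bits
print(s, from_twos_complement(s))  # 1001 -7  (overflow!)
```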
90 Excess Notation
91 Represent Floating-Point using Excess Notation (8 Bits)
±.Mantissa × 2^Exponent
1 sign bit, 3 exponent bits (excess notation), 4 mantissa bits
e.g. 01101011 = .1011₂ × 2^2 = 10.11₂ = 2¾
92 Fraction to Floating Point
3/8 = .0110₂ × 2^0 → 01000110 (not normalized)
3/8 = .1100₂ × 2^-1 → 00111100 (normalized)
0 = .0000₂ → 00000000
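A decoder for this 8-bit format can verify the patterns (assuming, as in these slides, 1 sign bit, 3 exponent bits in excess-4 notation, and a 4-bit mantissa read as a fraction just right of the radix point):

```python
def decode_float8(pattern):
    """Decode an 8-bit floating-point pattern laid out as s eee mmmm."""
    sign = -1 if pattern[0] == "1" else 1
    exponent = int(pattern[1:4], 2) - 4   # excess-4 notation
    mantissa = int(pattern[4:], 2) / 16   # .mmmm as a fraction
    return sign * mantissa * 2**exponent

print(decode_float8("01101011"))  # 2.75  (.1011 x 2^2 = 10.11 in binary)
print(decode_float8("00111100"))  # 0.375 (3/8, normalized form)
print(decode_float8("01000110"))  # 0.375 (3/8, unnormalized form)
```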
93 ‘number of bits’ matters!
94 Loss of Digits
When adding two numbers, the ‘Exponent’ components need to be made the same; aligning them can result in ‘loss of digits’
A × 2^B = (A shifted right by n bits) × 2^(B+n), e.g. 1000 × 2^0 = 0001 × 2^3
Example: 2½ + ⅛ + ⅛ (in binary: 10.1 + 0.001 + 0.001)
Encoded: 2½ = 01101010, ⅛ = 00011000
Aligning ⅛ to 2½'s exponent shifts its mantissa bits out entirely (00011000 becomes 01100000, i.e. zero), so 01101010 + 01100000 = 01101010, twice over
Result: 2½ instead of the correct 2¾
95 Chapter 1: Data Storage
1.1 Bits and Their Storage 1.2 Main Memory 1.3 Mass Storage 1.4 Representing Information as Bit Patterns 1.5 The Binary System 1.6 Storing Integers 1.7 Storing Fractions 1.8 Data Compression 1.9 Communications Errors
96 Why compression?
Save storage Save energy/power Save money
97 Generic Data Compression
Run-length encoding Relative encoding Frequency-dependent encoding Adaptive dictionary encoding
98 Run-Length Encoding
Replace a sequence with The repeating value The number of times it occurs in the sequence Example: 1111111111100000000001111111100 (1, 11), (0, 10), (1, 8), (0, 2)
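A minimal encoder for this scheme (the function name is invented for the example):

```python
def run_length_encode(bits):
    """Compress a bit string into (value, run-length) pairs."""
    runs = []
    i = 0
    while i < len(bits):
        j = i
        while j < len(bits) and bits[j] == bits[i]:
            j += 1                       # extend the current run
        runs.append((int(bits[i]), j - i))
        i = j
    return runs

stream = "1" * 11 + "0" * 10 + "1" * 8 + "0" * 2  # the slide's example
print(run_length_encode(stream))  # [(1, 11), (0, 10), (1, 8), (0, 2)]
```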
99 Relative Encoding
Record the differences between consecutive data blocks Each block is encoded in terms of its relationship to the previous block e.g. 91,94,95,95,98,100,101,102,105,110 => -9,-6,-5,-5,-2,100,1,2,5,10 Real-life example: consecutive frames of a motion picture (MP4)
100 English letter frequency
101 Frequency-Dependent Encoding
Variable-length codes “a” represented by 01 “q” represented by 10101 The length of the bit pattern used to represent a data item is inversely related to the frequency of the item's use x: data item F(x): frequency of x appearing C(x): code to represent the data item |C(x)| ∝ 1/F(x)
102 The Idea
Letters e, t, a, i are used more frequently than letters z, q, x Use short bit patterns for letters e, t, a, i, and longer patterns for z, q, x
103 A Simple Example
A 4-symbol case Symbol A B C D
Without special code Symbol A B C D Code 00 01 10 11
104 Huffman Coding
A 4-symbol case
Symbol A B C D
Probability 0.75 0.1 0.075 0.075
Code 0 10 110 111
Average length R = 0.75×1 + 0.1×2 + (0.075 + 0.075)×3 = 1.4 bits
105 Saving with Huffman Coding
Normally 2 bits/sample for 4 symbols Huffman coding 1.4 bits/sample on average
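The saving is easy to recompute (a sketch using the probabilities and codes from the 4-symbol example above):

```python
probability = {"A": 0.75, "B": 0.1, "C": 0.075, "D": 0.075}
code = {"A": "0", "B": "10", "C": "110", "D": "111"}

# Expected bits per symbol: sum of probability * code length
avg = sum(probability[s] * len(code[s]) for s in probability)
print(round(avg, 2))  # 1.4, versus 2.0 for the fixed-length code
```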
106 Dictionary Encoding
Searches for matches between the text to be compressed and a set of strings contained in a data structure (called the 'dictionary'). When the encoder finds such a match, it substitutes a reference to the string's position in the data structure. Static dictionary vs. adaptive dictionary
107 Static dictionary encoding
A static dictionary is one whose full set of strings is determined before coding begins and does not change during the coding process Often used when the message to be encoded is fixed and large
108 Adaptive Dictionary Encoding
Dictionary: the collection of building blocks (frequent patterns) from which the message being compressed is constructed The dictionary is allowed to change during the encoding process Future occurrences of a pattern are encoded as a single reference to the dictionary
109 Lempel-Ziv (LZ77) Decoding: xyxxyzy (5, 4, x)
110 Another Example
Decoding: xyxxyzy (5,4,x)(0,0,w)(8,6,y)
xyxxyzyxxyzx (0,0,w)(8,6,y)
xyxxyzyxxyzxw (8,6,y)
xyxxyzyxxyzxwzyxxyzy
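The decoding steps can be mechanized (a sketch; the triple is read as (how far back, how many symbols to copy, next symbol), matching the walkthrough):

```python
def lz77_decode(seed, triples):
    s = seed
    for back, length, symbol in triples:
        start = len(s) - back
        for _ in range(length):  # copy one symbol at a time, since the
            s += s[start]        # copied region may overlap what we append
            start += 1
        s += symbol
    return s

print(lz77_decode("xyxxyzy", [(5, 4, "x"), (0, 0, "w"), (8, 6, "y")]))
# xyxxyzyxxyzxwzyxxyzy
```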
111 Lempel-Ziv (LZ77) Encoding
For a message, divide it into A text window (~KBytes) followed by A look-ahead buffer (10~1000 bytes) Then find the longest segment in the text window (starting from the beginning of the window) that agrees with a pattern (of length > 2) at the start of the look-ahead buffer Use a triplet (distance back, match length, next symbol) to encode that segment of the look-ahead buffer Slide the window and buffer forward and repeat
112 Lempel-Ziv Encoding: LZ77
Look-ahead buffer: begins at symbol 7; text window: 8 symbols
Encoding xyxxyzyxxyzxwzyxxyzy step by step:
xyxxyzy | xxyzxwzyxxyzy gives (5, 4, x)
xyxxyzyxxyzx | wzyxxyzy gives (0, 0, w)
xyxxyzyxxyzxw | zyxxyzy gives (8, 6, y)
Final encoding: xyxxyzy (5, 4, x)(0, 0, w)(8, 6, y)
113 Compressing Images
Images 3-byte-per-pixel (RGB) Large, unmanageable maps Compression techniques GIF (Graphic Interchange Format )
Uses LZW compression (a Lempel-Ziv variant) JPEG (Joint Photographic Experts Group)
114 JPEG’s Lossless Mode
No information is lost in encoding the picture Codes the difference between adjacent pixels (relative encoding) The differences are encoded using a variable-length code (frequency-dependent encoding) Leads to file sizes too large to be practical Seldom used
115 JPEG’s Baseline Standard
The human eye is more sensitive to changes in brightness than to changes in color Encodes each pixel's brightness, but divides the image into four-pixel blocks and records only the average color of each block
116 JPEG’s Baseline Standard
Saves additional space by recording only the changes in brightness and color between components (relative encoding) The final bit pattern is further compressed by applying a variable-length encoding system Compression ratio of about 20 to 1
117 Compressing Sound
MP3 = MPEG-1 Audio Layer 3 Sound is composed of sine waves of different frequencies Allocates bits according to the human ear's response to sound at different frequencies, then applies Huffman coding Compression ratio of about 12 to 1
118 Chapter 1: Data Storage
1.1 Bits and Their Storage 1.2 Main Memory 1.3 Mass Storage 1.4 Representing Information as Bit Patterns 1.5 The Binary System 1.6 Storing Integers 1.7 Storing Fractions 1.8 Data Compression 1.9 Communications Errors
119 Why Errors happen
Temperature Communication media Cable vs. wireless Interference
120 Communication Errors
Communication = transmission of data Errors can corrupt the bit pattern between what is transferred and what is received Error-encoding techniques allow detection, and sometimes correction, of errors
121 Adjusted for Odd Parity
Making sure there’s an odd number of ‘1’s
(figure: the ASCII codes for ‘A’ and ‘F’ adjusted for odd parity)
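The adjustment can be sketched in code (ASCII 'A' = 1000001 and 'F' = 1000110; the function name is this example's own):

```python
def add_odd_parity(bits7):
    """Prepend a parity bit so the 8-bit result has an odd number of 1s."""
    parity = "0" if bits7.count("1") % 2 == 1 else "1"
    return parity + bits7

print(add_odd_parity("1000001"))  # 'A' has two 1s   -> 11000001
print(add_odd_parity("1000110"))  # 'F' has three 1s -> 01000110
```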
Problem: parity can only detect an odd number of flipped bits; a double-bit error goes unnoticed!
122 Error Correction Codes
So that errors can be Not only detected But also corrected (probabilistically)
Hamming distance The number of bits in which two bit patterns differ
123 Error-Correcting Code
Hamming distance between A and B is 4
Any two patterns are separated by a Hamming distance of at least 3
A single flipped bit results in an illegal pattern, which can be corrected to the nearest legal pattern
124 Decoding 010100
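Nearest-codeword decoding can be sketched as follows; the eight-symbol code table here is an assumption of this illustration (any two of its codewords differ in at least 3 positions), so substitute the slides' actual table if it differs:

```python
def hamming_distance(a, b):
    """Number of positions in which two equal-length patterns differ."""
    return sum(x != y for x, y in zip(a, b))

codewords = {"A": "000000", "B": "001111", "C": "010011", "D": "011100",
             "E": "100110", "F": "101001", "G": "110101", "H": "111010"}

def nearest_codeword(received):
    # Correct an error by picking the closest legal pattern
    return min(codewords, key=lambda s: hamming_distance(codewords[s], received))

print(nearest_codeword("010100"))  # D: its codeword 011100 is at distance 1
```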
125 Questions?
126