Lecture 4: Data Units, Data Storage

Lecture 4: Digital vs Analog Data: Data Units, Data Storage Devices and Data Transmission over Internet

• Digital or binary data consists of a sequence of 0s and 1s, so each position has only two possible values. Analog data consists of continuous values, more like a decimal number.

Why Digital?
Today, nearly all electronic devices we use are digital. The main reason for the change from analog to digital is that digital signals are easier to transmit and offer less room for errors to occur. This leads to more accurate data transmission, which in turn leads to faster transmission rates and better productivity and quality.

Digital Data Units:
• A bit is the smallest unit of data (bit = binary digit). It represents either "1" or "0", corresponding to an "on" or "off" state. All data in a computer is represented as a sequence (pattern) of bits. A single bit is not very useful for storing information.
• A group of 8 bits is called a byte. Since each bit can be either 0 or 1, there are 2^8 = 256 different bit patterns that can be represented with 8 bits. ASCII is a standardized scheme for representing characters as bit patterns, one character per byte. Standard ASCII needs only 7 bits (2^7 = 128 codes, enough for upper and lower case letters, digits and punctuation). Since we use bits in groups of 8, the extra bit can be used for error checking.
• A word is sometimes used to refer to a group of bytes. The actual size of a word depends on the computer used and the data it measures. On most PCs, the size of a word for an integer is 4 bytes.

(The following will be covered in detail in Lecture 5.)
• A document that contains plain text only (such as a Notepad file) is called an ASCII file or a text file. Each character of text is stored as one ASCII pattern, in one byte of memory. So a file containing 20 lines of text, with 100 characters per line, would be stored in 2000 bytes.
• Other data can be stored in other ways. Files that contain data that is not plain text (e.g. Word documents, which contain formatting information) are not stored as plain ASCII files, but the information is still stored in some type of binary format. They are called binary files. (What happens when you try to open a Word document in Notepad? Sometimes you see garbage characters on the screen, because those bytes don't correspond to ASCII codes.)
• The size of a file = number of bytes stored in the file. For plain ASCII text files, the size of the file = number of characters. Word processing documents are larger because of the extra formatting information that is part of the file.

• 1 kilobyte (KB) = 1,024 bytes = 2^10 bytes (in the example above, a file of 20 lines of text, about 100 chars per line, would be about 2 KB)
• 1 megabyte (MB) = 1,024 KB = 2^20 bytes (about 1,000 pages of text, each page 20 lines of 100 chars, would be about 2MB). A floppy disk can store 1.44MB, which is usually enough for several short text files.
• 1 gigabyte (GB) = 1,024 MB = 2^30 bytes
• 1 terabyte (TB) = 1,024 GB = 2^40 bytes

Data Storage Devices (Memory types):
• Storage capacity specifies the maximum amount of data a storage device can hold (e.g. 1.44MB, 2GB, etc.).
• IC chips: main memory (cache/1MB, RAM/1GB; fastest, most expensive, volatile), USB flash drive (2GB)
• Magnetic disks (slower, cheaper, permanent storage)
– data stored by polarization of electrons
– magnetized/un-magnetized spots represent 1s & 0s
– concentric rings (tracks) with the same amount of information on each track, therefore lower storage capacity
– floppy diskettes (1.44MB), Zip disk (750MB), Jaz disk (2GB), hard disks (64GB)
• Optical disks (slowest, least expensive, non-volatile)
– reflective coating over plastic
– laser used to change reflective properties of surface (burning to create pits and bumps to represent 1s & 0s)
– spiral track (higher capacity than concentric rings)
– constant density (higher storage capacity)
– CD (600~700MB), DVD (10GB)

[Figure: track layout, Magnetic vs. Optical]

File Compression:
• Multimedia files are very large, so file compression techniques are used to save storage space (audio: mp3; video: flv, mpeg, …; image: jpeg, etc.).
• "A picture is worth 1,000 words". Actually, computer scientists would say that it is worth more! 1,000 words, at an average of 5 bytes per word = 5,000 bytes = about 5KB. That's only enough for a very, very tiny picture. Most graphics on the web are over 30KB!
• (How graphics are stored, high-resolution vs. low-resolution → tradeoff of image quality vs. storage space: will be covered in detail in Lecture 5.)

Data Transmission Speed:
• On the Internet or data communication lines, information is transferred as a sequence of bits (0s and 1s).
• The speed of data transmission is measured in bps (bits per second), Kbps (1000 bps), Mbps (1000 Kbps), Gbps (1000 Mbps), etc. The time it takes to download a file depends on the size of the file and the speed of the transmission.
• To connect to the Internet from home, you can use dialup modems (less than 56 Kbps), DSL (3 Mbps download), cable (15 Mbps download) or fiber optics (5~30 Mbps download).

Packet Switching for Data Transfer over Internet:
• Packet switching is the approach used by the Internet to deliver data across a local or long distance connection.
• Data sent through the Internet is broken by the sending computer into small parts called packets.
– Large messages cannot monopolize the connection
– Data won't be stuck in a congested path
– If one packet is received, the others will be too (reliable)
• The sending computer adds overhead information to packets to form IP datagrams. IP datagrams have headers, like envelopes, containing source IP addresses, destination IP addresses, packet sequence numbers, etc.
• Routers (what are they?) receive the packets, examine the destination addresses and route the packets appropriately. Routers use various information sources to determine the best path for each packet to follow.
• (Analogy to mailing a book. Instead of mailing the whole book at once, mail each page separately. Put each page into its own envelope. They may arrive on different days, out of order; some may get lost.)
• Different packets may take different routes. They may arrive in a different order, or may be lost on the way.
• The receiving computer unpacks the packets and reconstructs the data: it puts the packets in order and discards duplicates.
• The receiving computer sends back an acknowledgment signal to the sending computer for each packet it received.
• The sending computer waits for all acknowledgments. If one is not received within a certain period of time, the sending computer resends the corresponding packet with a longer waiting period. This may result in multiple copies of a packet arriving at the destination.
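
The packet steps above (split the data, number the pieces, let them arrive out of order or duplicated, then put them back together) can be sketched in a few lines of Python. This is only an illustration of the idea, not how real IP software is written: the function names make_packets and reassemble are made up for this example, each "packet" carries just a sequence number and a payload, and acknowledgments and timeouts are left out.

```python
import random


def make_packets(message: bytes, size: int) -> list[tuple[int, bytes]]:
    """Split a message into (sequence_number, payload) packets."""
    return [
        (seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]


def reassemble(received: list[tuple[int, bytes]]) -> bytes:
    """Put packets back in order and discard duplicates."""
    unique: dict[int, bytes] = {}
    for seq, payload in received:
        unique.setdefault(seq, payload)  # keep the first copy of each packet
    return b"".join(unique[seq] for seq in sorted(unique))


if __name__ == "__main__":
    message = b"Mail each page of the book in its own envelope."
    packets = make_packets(message, size=8)

    # Simulate the network: one packet was resent (so it arrives twice)
    # and everything arrives out of order.
    received = packets + [packets[2]]
    random.Random(0).shuffle(received)

    print(reassemble(received) == message)  # prints True
```

The sequence number is what makes this work: without it, the receiver could neither sort the pieces nor tell a resent copy from new data.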

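
The unit definitions and transmission speeds from this lecture can be combined into a small calculation. The sketch below assumes plain ASCII text (one byte per character) and ignores packet overhead, so the download times are rough lower bounds. The helper names text_file_size and download_seconds are invented for this example.

```python
# Storage units are powers of 2; transmission units are powers of 10.
KB = 2**10
MB = 2**20
GB = 2**30
TB = 2**40

KBPS = 1000          # bits per second in 1 Kbps
MBPS = 1000 * KBPS   # bits per second in 1 Mbps


def text_file_size(lines: int, chars_per_line: int) -> int:
    """Size in bytes of a plain ASCII text file (one byte per character)."""
    return lines * chars_per_line


def download_seconds(size_bytes: int, bits_per_second: int) -> float:
    """Time to transfer a file: 8 bits per byte divided by the line speed."""
    return size_bytes * 8 / bits_per_second


if __name__ == "__main__":
    size = text_file_size(lines=20, chars_per_line=100)
    print(size, "bytes, about", round(size / KB, 2), "KB")
    print("over 56 Kbps dialup:", round(download_seconds(size, 56 * KBPS), 3), "s")
    print("1 MB over 3 Mbps DSL:", round(download_seconds(MB, 3 * MBPS), 3), "s")
```

Note the factor of 8: file sizes are given in bytes, but line speeds are given in bits per second.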