Chapter 16 Disk Storage, Basic File Structures, Hashing, and Modern Storage

- Databases are stored as files of records on disks
- Physical database file structures
- Physical level of the three-schema architecture
- The collection of data in a DB must be stored on some storage medium. The DBMS software can retrieve, update, and process this data as needed
- Storage media form a hierarchy
  - primary, secondary, tertiary, etc.
  - offline storage, archiving databases (larger capacity, lower cost, slower access, not directly accessible by the CPU)

Memory Hierarchies and Storage Devices
- Cache, static RAM (prefetch, pipeline)
- Dynamic RAM (main memory)

Secondary and Tertiary Storage
- Mass storage (magnetic disks, CD, DVD), measured in KB, MB, GB, TB, PB
- Programs are in main memory (DRAM)
- Permanent databases reside in secondary storage
- Main memory buffers are used to read from and write to secondary storage
- Flash memory: nonvolatile, NAND and NOR flash based
- Optical disks: CDs (700 MB), DVDs (4.5 to 15 GB), Blu-ray (54 GB)
- Magnetic tapes and jukeboxes

Depending upon the intended use and application requirements, data is kept in one or more levels of the hierarchy.

Storage Organization of Databases
- Large amounts of data must persist for a long period of time (called persistent data)
- Parts of this data are accessed and processed repeatedly during the storage period
- Transient data exists only during the period of execution
- Most DBs are stored on secondary storage (magnetic disks)
  - The DB is too large to fit in main memory
  - Permanent loss of data is less likely on disk
  - Storage costs less on disk than in primary storage

- A range of cylinders has the same number of sectors per arc.
- A common sector size is 512 bytes
- The division of a track into equal-sized disk blocks (or pages) is set by the OS during formatting
- The block size is fixed and can't be changed dynamically
- Block sizes range from 512 to 8192 bytes
- Blocks are separated by fixed-size interblock gaps
- Storage capacity and transfer rates are improving all the time, while cost keeps falling ($100/TB)

Disk
- Random access addressable device
- Transfer between disk and main memory is in units of blocks
- The hardware address of a block consists of (cylinder#, track#, block#)
- Modern disks use a single number called the LBA (logical block address)
- An LBA in the range 0 to n-1 is mapped to the right block on the disk
- The block is transferred to or from a buffer, a contiguous reserved area in main memory
- Either one block or a cluster of blocks is transferred at a time
- The disk controller controls the disk drive
- A standard interface from a computer to a disk is SCSI (Small Computer System Interface)
- HDDs, CDs and DVDs are commonly connected to a computer through SATA (Serial ATA; AT refers to the 16-bit IBM AT bus), at 1.5 to 6 Gbps
- A more recent variant is NL-SAS (nearline SAS)
- The controller accepts high-level I/O commands and takes the appropriate action to position the arm and perform the read/write
- Seek time: 5-10 msec
- Rotational latency: 4 msec
- Block transfer time
- Transferring several consecutive blocks on the same track or cylinder is more effective: it avoids seek time and rotational latency for every block except the first (total time 9-60 msec for the first block, 0.4 to 2 msec for each subsequent block)
- Locating data on a disk is a major bottleneck, so efficient techniques are needed

Making Data Access More Efficient on Disk
(1) Buffering
   a. Compensates for the mismatch of speeds between CPU and disks
   b. The application uses current data while I/O fetches new data into the buffer
(2) Organization
   a. Use contiguous cylinders and tracks
   b. Avoid movement of the arm and seek time
(3) Prefetch
   a. Read data ahead of the request
   b. Read consecutive blocks on tracks or cylinders even though they are not yet needed
   c. May not be efficient for randomly accessed data
(4) Scheduling
   a. Proper scheduling of I/O requests
   b. Efficient scheduling algorithms (e.g. elevator)
(5) Use Log Disks
   a. Log disks hold data temporarily
   b. A single disk is used to hold the log of writes
   c. All blocks go to the disk sequentially, avoiding seek time
   d. Place data and log files on the log disk
   e. Not possible for most applications
(6) Use Flash Memory
   a. Use SSDs or flash memory instead of hard disks
   b. Do writes and updates to battery-backed DRAM
   c. Save to hard disk later

Solid-State Device (SSD) Storage
- Uses flash memory as intermediate storage
- Enterprise flash drives (EFDs)

Magnetic Tape Storage Devices
- Sequential access devices: to access the nth block on tape, the preceding n-1 blocks must be scanned first
- A read/write head is used to access tapes
- Used for backup and recovery

Buffering of Blocks
When several blocks are to be transferred to memory and all the block addresses are known, several buffers can be reserved in memory to speed up the transfer. While one buffer is being read or written by I/O, the CPU can process data in another buffer.
- Processes A and B run concurrently in an interleaved fashion; C and D run in parallel.
- The use of two buffers is shown in Fig. 16.4. File A is in one buffer and File B is in another buffer (double buffering)
- Double buffering permits continuous reading or writing of data on consecutive disk blocks, which avoids seek time and rotational delay for all but the first block.

Buffer Management
- It is impossible to bring all data into memory at the same time
- A buffer is a part of main memory that is available to receive blocks or pages of data from disk
- The buffer manager is a software component of a DBMS that manages buffers. It decides which pages to bring in and which buffer to use
- The size of the shared buffer pool is a DBMS parameter controlled by DBAs

Two kinds of buffer management:
1. The buffer manager controls main memory directly (RDBMS)
2. The buffer manager allocates buffers in virtual memory (OS control), as in OODBMS

Goals:
1. Maximize the probability that a requested page is found in main memory
2. Use an efficient page replacement algorithm

Information kept for each buffer:
1. A pin-count (number of requests or number of current users). If the count is 0, the page is unpinned; a pinned block should not be allowed to be written to disk
2. A dirty bit, set when a page is updated by any application program

Handling a page request:
a. Make sure the number of buffers fits in main memory
b. If the requested amount exceeds the buffer pool, use page replacement
c. If the buffer space is in virtual memory, OS thrashing may happen
d. If the requested page is already in the buffer pool, increment its pin count
e. If the page is not in the buffer pool:
   i. choose a page for replacement
   ii. if the dirty bit of the replacement page is on, write that page to disk first (replacing its old copy on disk); then use the slot for the new page, copy the data in, and hand the buffer to the application.

Buffer Replacement Strategies
1. LRU (least recently used): maintain a time stamp per page; the page that has gone unused for the longest time is replaced
2. Clock policy: a round-robin variant of LRU with a flag (0 or 1) in each slot. If the flag is 0, use the slot; if it is 1, reset it to 0 and move on. If the dirty bit of the chosen slot is set, write the page to disk first
3. FIFO
   a. Notes the time each page was loaded into memory
   b. Simple approach
   c. It may throw out a block that has to be brought back soon afterwards

LRU and clock are the policies most commonly used for DB applications.

Placing File Records on Disk
A set of records is organized into a set of files.

Records and Record Types:
- Data is in the form of records
- Each record consists of a collection of related data values or items (each corresponds to a field)
- A record type is a collection of field names and their data types
- A record describes an entity and its attributes
- A data type is associated with each field
- Standard data types: integer, long, float, char, ...
- Other data types: date, time, ...

    struct employee {
        char name[30];
        char ssn[9];
        int  salary;
        int  job_code;
        char department[20];
    };

Databases also have to store unstructured data such as digital images and videos as binary large objects (BLOBs). The BLOB is stored separately, and a pointer to it is included in the record.
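Returning to the buffer manager bookkeeping and the clock policy described under Buffer Replacement Strategies, the following is a minimal sketch rather than a definitive implementation. It assumes a tiny in-memory pool and only counts write-backs instead of doing real disk I/O. The names `POOL_SIZE`, `Frame`, `BufferPool`, `request_page`, `release_page` and `disk_writes` are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 4  /* hypothetical pool size, for illustration only */

/* One slot of the buffer pool with the information the notes list. */
typedef struct {
    int  page_id;    /* page held in this slot, -1 if empty */
    int  pin_count;  /* number of current users; 0 means unpinned */
    bool dirty;      /* set when the page was updated in memory */
    bool ref_flag;   /* clock flag: 1 = recently used, 0 = candidate */
} Frame;

typedef struct {
    Frame  frames[POOL_SIZE];
    size_t hand;         /* current position of the clock hand */
    int    disk_writes;  /* simulated write-backs of dirty pages */
} BufferPool;

void pool_init(BufferPool *bp) {
    for (size_t i = 0; i < POOL_SIZE; i++) {
        bp->frames[i] = (Frame){ .page_id = -1, .pin_count = 0,
                                 .dirty = false, .ref_flag = false };
    }
    bp->hand = 0;
    bp->disk_writes = 0;
}

/* Clock policy: returns the victim slot, or -1 if every slot is pinned. */
int clock_pick_victim(BufferPool *bp) {
    /* Two sweeps suffice: the first may only clear flags,
       the second then finds a slot whose flag is 0. */
    for (size_t step = 0; step < 2 * POOL_SIZE; step++) {
        size_t i = bp->hand;
        Frame *f = &bp->frames[i];
        bp->hand = (bp->hand + 1) % POOL_SIZE;
        if (f->pin_count > 0) continue;                    /* pinned: skip */
        if (f->ref_flag) { f->ref_flag = false; continue; } /* second chance */
        if (f->dirty) { bp->disk_writes++; f->dirty = false; } /* write back */
        return (int)i;
    }
    return -1;
}

/* Request a page: returns its slot, or -1 if no slot can be freed. */
int request_page(BufferPool *bp, int page_id) {
    assert(page_id >= 0);
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (bp->frames[i].page_id == page_id) {  /* already in the pool */
            bp->frames[i].pin_count++;
            bp->frames[i].ref_flag = true;
            return (int)i;
        }
    }
    int victim = clock_pick_victim(bp);
    if (victim < 0) return -1;
    /* A real buffer manager would read the page from disk here. */
    bp->frames[victim] = (Frame){ .page_id = page_id, .pin_count = 1,
                                  .dirty = false, .ref_flag = true };
    return victim;
}

/* Release a page; 'modified' sets the dirty bit. */
void release_page(BufferPool *bp, int slot, bool modified) {
    assert(slot >= 0 && slot < POOL_SIZE);
    Frame *f = &bp->frames[slot];
    assert(f->pin_count > 0);
    f->pin_count--;
    if (modified) f->dirty = true;
}
```

A caller would run `pool_init` once, then pair every `request_page` with a `release_page`. A slot becomes a replacement candidate only after its pin count drops to 0 and the clock hand has cleared its flag.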

Files, Fixed-Length and Variable-Length Records
- A file usually holds records of the same record type
- If every record is the same size, the file has fixed-length records
- If different records have different lengths, the file has variable-length records. Causes:
  o Variable-length fields (name)
  o Repeating fields, or repeating group fields
  o Different types of records in one file
- Separator characters are used for variable-length fields
- If a record type has many fields but each record holds values for only a few of them, pairs are stored instead:
  . <field name, field value>
  . <field type, field value>
- Repeating fields need one character to separate values, one character to separate fields, and one character to terminate the record (= , ||, #)
- Such characters are part of the file system but hidden from the programmer (for example 0x0d and 0x0a, carriage return and line feed)

Record Blocking
- Records are stored in blocks (sectors)
- Block size: B
- Record size: R
- The unit of transfer from disk to memory is a block
- If B ≥ R, the blocking factor is $bfr = \lfloor B/R \rfloor$ records per block (integer division)
- If R does not divide B evenly, the unused space in each block is $B - (bfr \cdot R)$ bytes
- To utilize this space, a record may be spanned across two blocks. If R > B, records must be spanned
- For unspanned records, the number of blocks needed for a file of r records is $b = \lceil r/bfr \rceil$ blocks (rounded up to the next integer)

Allocation of Files on Disk
- Contiguous
- Linked
- Indexed
- (clusters and extents)

File Headers
- Contain information about files (disk addresses, record format descriptions)
- Records are copied into memory and searched one block at a time

[Figures: Contiguous Allocation; Linked Allocation; Indexed Allocation]

Operations on Files
- Retrieval
- Updates

Simple or compound selection conditions are used:
  Ssn = '12345678'
  Department = 'Research'
  Salary > 30000
Complex conditions must be decomposed into simple conditions to locate records on the disk.
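The decomposition of a compound condition into simple conditions can be sketched as follows. This is an illustrative sketch only. It assumes a compound condition is a conjunction (AND) of simple `field operator value` tests, evaluated against in-memory records. The names `Employee`, `SimpleCond`, `eval_simple`, `eval_conjunction` and `select_records` are hypothetical, and `ssn` is given one extra byte for a terminating NUL so that `strcmp` can be used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Record layout modeled on the employee record in the notes. */
typedef struct {
    char name[30];
    char ssn[10];        /* assumption: extra byte for the terminating NUL */
    int  salary;
    int  job_code;
    char department[20];
} Employee;

typedef enum { F_SSN, F_DEPARTMENT, F_SALARY } Field;
typedef enum { OP_EQ, OP_GT, OP_LT } Op;

/* One simple condition: <field> <operator> <value>. */
typedef struct {
    Field       field;
    Op          op;
    const char *str_value;  /* used for string fields */
    int         int_value;  /* used for integer fields */
} SimpleCond;

static bool cmp_int(int a, Op op, int b) {
    switch (op) {
    case OP_EQ: return a == b;
    case OP_GT: return a > b;
    case OP_LT: return a < b;
    }
    return false;
}

static bool cmp_str(const char *a, Op op, const char *b) {
    assert(b != NULL);  /* string conditions must carry a string value */
    return cmp_int(strcmp(a, b), op, 0);
}

/* Evaluate one simple condition against one record. */
bool eval_simple(const Employee *e, const SimpleCond *c) {
    switch (c->field) {
    case F_SSN:        return cmp_str(e->ssn, c->op, c->str_value);
    case F_DEPARTMENT: return cmp_str(e->department, c->op, c->str_value);
    case F_SALARY:     return cmp_int(e->salary, c->op, c->int_value);
    }
    return false;
}

/* A compound condition, treated here as the AND of its simple parts. */
bool eval_conjunction(const Employee *e, const SimpleCond *conds, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (!eval_simple(e, &conds[i])) return false;
    }
    return true;
}

/* Scan the records; store the indexes of matches and return their count. */
size_t select_records(const Employee *recs, size_t n_recs,
                      const SimpleCond *conds, size_t n_conds,
                      size_t *out_idx) {
    size_t hits = 0;
    for (size_t i = 0; i < n_recs; i++) {
        if (eval_conjunction(&recs[i], conds, n_conds)) out_idx[hits++] = i;
    }
    return hits;
}
```

For the conditions listed above, `Department = 'Research'` and `Salary > 30000` would be passed as a two-element `SimpleCond` array. A DBMS would normally use one simple condition to locate candidate blocks on disk and test the remaining conditions on each record it reads.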
High-level programs such as DBMS software use file operations such as:
- Open
- Reset
- Find (or Locate)
- Read (or Get)
- Find Next
- Delete
- Modify
- Insert
- Close
- Scan (returns the first or next record)
- FindAll
- FindChar
(These are record-at-a-time operations, except Reset and Close.)

Files of Unordered Records (Heap)
- Records are placed in the file in the order in which they arrive and are inserted; new records are placed at the end. This arrangement is called a HEAP.
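A heap file with fixed-length, unspanned records can be sketched as below. This is a minimal in-memory sketch, not a definitive implementation. It assumes B = 512 bytes (the common sector size mentioned earlier) and a hypothetical record size R = 100 bytes, so the blocking factor is 5 and 12 bytes per block stay unused. The names `HeapFile`, `heap_insert` and `heap_find` are illustrative, and the search key is assumed to be a prefix of the record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE  512                     /* B */
#define RECORD_SIZE 100                     /* R: hypothetical record length */
#define BFR (BLOCK_SIZE / RECORD_SIZE)      /* blocking factor floor(B/R) = 5 */

typedef struct {
    unsigned char data[BLOCK_SIZE];
    int           n_records;   /* records currently stored in this block */
} Block;

typedef struct {
    Block *blocks;
    size_t n_blocks;
    size_t n_records;
} HeapFile;

void heap_init(HeapFile *hf) {
    hf->blocks = NULL;
    hf->n_blocks = 0;
    hf->n_records = 0;
}

void heap_free(HeapFile *hf) {
    free(hf->blocks);
    heap_init(hf);
}

/* Insert: the new record always goes at the end of the file.
   Returns 0 on success, -1 if memory could not be allocated. */
int heap_insert(HeapFile *hf, const void *record) {
    if (hf->n_blocks == 0 || hf->blocks[hf->n_blocks - 1].n_records == BFR) {
        /* Last block is full (or there is none): append a new block. */
        Block *grown = realloc(hf->blocks, (hf->n_blocks + 1) * sizeof(Block));
        if (grown == NULL) return -1;
        hf->blocks = grown;
        memset(&hf->blocks[hf->n_blocks], 0, sizeof(Block));
        hf->n_blocks++;
    }
    Block *last = &hf->blocks[hf->n_blocks - 1];
    memcpy(last->data + (size_t)last->n_records * RECORD_SIZE,
           record, RECORD_SIZE);
    last->n_records++;
    hf->n_records++;
    return 0;
}

/* Linear search, one block at a time. Compares the first key_len bytes of
   each record with key. Returns the record number (0-based) or -1, and
   reports in *blocks_read how many blocks were examined. */
long heap_find(const HeapFile *hf, const void *key, size_t key_len,
               size_t *blocks_read) {
    assert(key_len <= RECORD_SIZE);
    *blocks_read = 0;
    for (size_t b = 0; b < hf->n_blocks; b++) {
        const Block *blk = &hf->blocks[b];
        (*blocks_read)++;
        for (int i = 0; i < blk->n_records; i++) {
            if (memcmp(blk->data + (size_t)i * RECORD_SIZE, key, key_len) == 0)
                return (long)(b * BFR + (size_t)i);
        }
    }
    return -1;
}
```

Insertion touches only the last block, which is why it is cheap in a heap file. An unsuccessful search has to examine every block, because the records are in no particular order.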
