To my mother, Behjat, and my father, Mohammad—though they were not given the opportunity to attend or finish school, they did everything they could to make sure that all of their seven children obtained college degrees.
S. K. R.

To my mother who taught me to love reading, learning, and books; my father who taught me to love mathematics, science, and computers; and the rest of my family who put up with me while we wrote this book—this would not have been possible without you.
F. S. H.

CONTENTS

Preface xxv

1 Introduction 1
  1.1 Database Concepts, 2
    1.1.1 Data Models, 2
    1.1.2 Database Operations, 2
    1.1.3 Database Management, 3
    1.1.4 DB Clients, Servers, and Environments, 3
  1.2 DBE Architectural Concepts, 4
    1.2.1 Services, 4
    1.2.2 Components and Subsystems, 5
    1.2.3 Sites, 5
  1.3 Archetypical DBE Architectures, 6
    1.3.1 Required Services, 6
    1.3.2 Basic Services, 7
    1.3.3 Expected Services, 8
    1.3.4 Expected Subsystems, 9
    1.3.5 Typical DBMS Services, 10
    1.3.6 Summary Level Diagrams, 11
  1.4 A New Taxonomy, 13
    1.4.1 COS Distribution and Deployment, 13
    1.4.2 COS Closedness or Openness, 14
    1.4.3 Schema and Data Visibility, 15
    1.4.4 Schema and Data Control, 16
  1.5 An Example DDBE, 17
  1.6 A Reference DDBE Architecture, 18
    1.6.1 DDBE Information Architecture, 18
    1.6.2 DDBE Software Architecture, 20
      Components of the Application Processor, 21
      Components of the Data Processor, 23
  1.7 Transaction Management in Distributed Systems, 24
  1.8 Summary, 31
  1.9 Glossary, 32
  References, 33

2 Data Distribution Alternatives 35
  2.1 Design Alternatives, 38
    2.1.1 Localized Data, 38
    2.1.2 Distributed Data, 38
      Nonreplicated, Nonfragmented, 38
      Fully Replicated, 38
      Fragmented or Partitioned, 39
      Partially Replicated, 39
      Mixed Distribution, 39
  2.2 Fragmentation, 39
    2.2.1 Vertical Fragmentation, 40
    2.2.2 Horizontal Fragmentation, 42
      Primary Horizontal Fragmentation, 42
      Derived Horizontal Fragmentation, 44
    2.2.3 Hybrid Fragmentation, 47
    2.2.4 Vertical Fragmentation Generation Guidelines, 49
      Grouping, 49
      Splitting, 49
    2.2.5 Vertical Fragmentation Correctness Rules, 62
    2.2.6 Horizontal Fragmentation Generation Guidelines, 62
      Minimality and Completeness of Horizontal Fragmentation, 63
    2.2.7 Horizontal Fragmentation Correctness Rules, 66
    2.2.8 Replication, 68
  2.3 Distribution Transparency, 68
    2.3.1 Location Transparency, 68
    2.3.2 Fragmentation Transparency, 68
    2.3.3 Replication Transparency, 69
    2.3.4 Location, Fragmentation, and Replication Transparencies, 69
  2.4 Impact of Distribution on User Queries, 69
    2.4.1 No GDD—No Transparency, 70
    2.4.2 GDD Containing Location Information—Location Transparency, 72
    2.4.3 Fragmentation, Location, and Replication Transparencies, 73
  2.5 A More Complex Example, 73
    2.5.1 Location, Fragmentation, and Replication Transparencies, 75
    2.5.2 Location and Replication Transparencies, 76
    2.5.3 No Transparencies, 77
  2.6 Summary, 78
  2.7 Glossary, 78
  References, 79
  Exercises, 80

3 Database Control 83
  3.1 Authentication, 84
  3.2 Access Rights, 85
  3.3 Semantic Integrity Control, 86
    3.3.1 Semantic Integrity Constraints, 88
      Relational Constraints, 88
  3.4 Distributed Semantic Integrity Control, 94
    3.4.1 Compile Time Validation, 97
    3.4.2 Run Time Validation, 97
    3.4.3 Postexecution Time Validation, 97
  3.5 Cost of Semantic Integrity Enforcement, 97
    3.5.1 Semantic Integrity Enforcement Cost in Distributed System, 98
      Variables Used, 100
      Compile Time Validation, 102
      Run Time Validation, 103
      Postexecution Time Validation, 104
  3.6 Summary, 106
  3.7 Glossary, 106
  References, 107
  Exercises, 107

4 Query Optimization 111
  4.1 Sample Database, 112
  4.2 Relational Algebra, 112
    4.2.1 Subset of Relational Algebra Commands, 113
      Relational Algebra Basic Operators, 114
      Relational Algebra Derived Operators, 116
  4.3 Computing Relational Algebra Operators, 119
    4.3.1 Computing Selection, 120
      No Index on R, 120
      B+ Tree Index on R, 120
      Hash Index on R, 122
    4.3.2 Computing Join, 123
      Nested-Loop Joins, 123
      Sort–Merge Join, 124
      Hash-Join, 126
  4.4 Query Processing in Centralized Systems, 126
    4.4.1 Query Parsing and Translation, 127
    4.4.2 Query Optimization, 128
      Cost Estimation, 129
      Plan Generation, 133
      Dynamic Programming, 135
      Reducing the Solution Space, 141
    4.4.3 Code Generation, 144
  4.5 Query Processing in Distributed Systems, 145
    4.5.1 Mapping Global Query into Local Queries, 146
    4.5.2 Distributed Query Optimization, 150
      Utilization of Distributed Resources, 151
      Dynamic Programming in Distributed Systems, 152
      Query Trading in Distributed Systems, 156
      Distributed Query Solution Space Reduction, 157
    4.5.3 Heterogeneous Database Systems, 170
      Heterogeneous Database Systems Architecture, 170
      Optimization in Heterogeneous Databases, 171
  4.6 Summary, 172
  4.7 Glossary, 173
  References, 175
  Exercises, 178

5 Controlling Concurrency 183
  5.1 Terminology, 183
    5.1.1 Database, 183
      Database Consistency, 184
    5.1.2 Transaction, 184
      Transaction Redefined, 188
  5.2 Multitransaction Processing Systems, 189
    5.2.1 Schedule, 189
      Serial Schedule, 189
      Parallel Schedule, 189
    5.2.2 Conflicts, 191
      Unrepeatable Reads, 191
      Reading Uncommitted Data, 191
      Overwriting Uncommitted Data, 192
    5.2.3 Equivalence, 192
    5.2.4 Serializable Schedules, 193
      Serializability in a Centralized System, 194
      Serializability in a Distributed System, 195
      Conflict Serializable Schedules, 196
      View Serializable Schedules, 196
      Recoverable Schedules, 197
      Cascadeless Schedules, 197
    5.2.5 Advanced Transaction Types, 197
      Sagas, 198
      ConTracts, 199
    5.2.6 Transactions in Distributed System, 199
  5.3 Centralized DBE Concurrency Control, 200
    5.3.1 Locking-Based Concurrency Control Algorithms, 201
      One-Phase Locking, 202
      Two-Phase Locking, 202
