Principles of Database Management. Wilfried Lemahieu, Seppe vanden Broucke, Bart Baesens. Index

Cambridge University Press 978-1-107-18612-5 — Principles of Database Management Wilfried Lemahieu , Seppe vanden Broucke , Bart Baesens Index More Information INDEX aborted transaction, 432 analytics Apache Kylin, 621 absolute address, 366 applications, 667 Apache Lucene, 616 abstraction. See generalization data pre-processing, 669–672 Apache Spark access category, 82 economic perspectives background of, 652–653 access modifiers, 59 in- versus outsourcing, 664–704 GraphX, 658 access paths, 397 on-premises versus cloud MLlib, 656–657 access transparency, 525 solutions, 705–706 Spark Core, 653–654 accessibility dimension, 617 open-source versus commercial Spark SQL, 654–656 accessor methods, 208 software, 706–708 Spark Streaming, 657–658 accuracy dimension, 617 return on investment, 702–704 Apache Sqoop, 621 accuracy ratio (AR), 687 total cost of ownership, 702 Apache Storm, 658 ACID properties improving ROI of Apollo program, 97 defined, 15, 452–453 cross-fertilization, 713–714 application developer, 12 in loosely coupled systems, 535–538 data quality, 711–712 application programming interface in NoSQL, 538 management support, 712 (API) active DBMS, 232–236 new data sources, 708–711 classification ActiveX Data Objects (ADO), 468, 502 organizational aspects, 712–713 background, 462–463 activity services, 607–608 post-processing, 700–701 early binding versus late binding, actuator, 353 predictive model evaluation, 689–696 465–466 ADO.NET, 468–471, 502, 533–534 privacy and security embedded versus call-level, after images, 435 accessing internal data, 717 464 after trigger, 233 anonymization, 717–718 proprietary versus universal, agglomerative hierarchical clustering, definitions and considerations for, 463–464 693 714–715 object persistence aggregate functions, 157 encryption, 721 Enterprise JavaBeans, 484–488 aggregated data, 717 importance of, 714 Entity Framework, 498–499 aggregation label-based access control, Java Data Objects, 495–498 in EER model, 55 719–721 
Java Persistence API, 488–494 mapping EER to relational, 137 RACI matrix, 715–716 object-relational mapping, in UML, 62–63 regulations, 721–723 483–484, 498 AJAX (Asynchronous JavaScript and SQL views, 719 SQLAlchemy, 499–502 XML), 508–509 process model, 665–666 universal database ALL, 178–181 success factors for, 701 ADO.NET, 468–471 allocation, 519, 523 types of embedded API versus embedded ALTER, 155–156 descriptive, 689–695 DBMS, 480–482 alternative keys, 109 predictive, 673–682 JDBC, 471–477 Amazon, 593 social network, 695–700 language-integrated querying, Amazon Redshift, 572, 621 analytics process model, 665–666 482–483 Amazon Relational Database Service anonymization (data), 717–718 ODBC, 466–467 (RDS), 621 ANY, 178–181 OLE DB and ADO, 467–468 Amazon Web Services, 706 Apache Flume, 621 SQL injection, 477–479 Amsterdam, 8 Apache Hadoop, 631 SQLJ, 479–480 © in this web service Cambridge University Press www.cambridge.org Cambridge University Press 978-1-107-18612-5 — Principles of Database Management Wilfried Lemahieu , Seppe vanden Broucke , Bart Baesens Index More Information Index 771 architecture categorization, 30–31 BASE transactions, 540 Boyce, Raymond, 120 architecture components Bayer, Rudolf, 388 Boyce–Codd normal form (BCNF), connection and security manager, BayesDB, 342 119–120 21–22 Bean-Managed Persistence (BMP), Brewer, Eric, 312–313, 539 DDL compiler, 22 488 B-tree, 378, 386–388 interacting with before image, 435 bucket, 365, 368–369 DDL statements, 21 before trigger, 233 buffer manager, 26 embedded DML statements, 21 begin_transaction instruction, 432 business activity monitoring (BAM), interactive queries, 21 behavior (OO), 244 593 interfaces, 27 BETWEEN, 159 business continuity query processor, 22–25 BFI, 384 contingency planning, recovery point storage manager, 25–26 bidirectional association, 61 and recovery time, 398–421 utilities, 26 Big Data defined, 421 archiving, 438 Apache Spark business intelligence (BI). 
See also arcs, 333 background of, 652–653 decision-making area under the ROC curve (AUC), 685 GraphX, 658 defined, 572 ASP (Active Server Pages), 506 MLlib, 656–657 hybrid OLAP, 575 association class, 60 Spark Core, 653–654 multidimensional OLAP, association rules Spark SQL, 654–656 574–587 basic setting, 689–690 Spark Streaming, 657–658 on-line analytical processing, 574 defined, 689 data integration outlook, 621 operational BI, 592 post-processing, 691 defined, 627 pivot tables, 573 support, confidence, and lift, 690–691 Hadoop query and reporting, 573–587 associations, 59–61, See also definition and design, 630 relational OLAP, 575 relationship type history of, 630–631 business process associative query, 221 SQL, 643–652 defined, 601 Asynchronous JavaScript and XML stack, 631–643 in database design, 38 (AJAX), 508–509 scope of business process integration atomic attribute type, 42 value, 627–629 data and process integration in, atomic literal, 217 variety, 627–629 606–610 atomic search key, 402–404 velocity, 627–629 defined and modeling, 601–602 atomicity property, 15, 452, 530–532 veracity, 627–629 managing dependencies, attribute type volume, 627–629 604–606 defined, 40 BigQuery ETL, 621 manual processes, 602–604 in ER model, 42–43 binary large object (BLOB), 155–156, in file organization, 362 247, 360–361 CallableStatement, 475–476 in index creation, 400 binary relationship type call-level APIs, 464 relationship, 46 cardinalities, 45 candidate key attributes, 57 defined, 44–45 defined, 108 authorization identifier, 150 mapped to a relational model, in file organization, 362 availability, 539 122–127 in index creation, 400 AVG, 162 and ternary types, 48–50 canonical form, 526 Axibase, 342 binary search, 364 CAP theorem, 312–313, 539 binary search trees, 385–386 Capability Maturity Model Integration B+-trees, 378, 388–389, 402–404 biometric data, 3 (CMMI), 619–620 Bachman diagram, 98 bitmap index, 383 cardinalities. 
See also multiplicities backup, 438 BLOB (binary large object), 155–156, CODASYL model, 101 backup and recovery utility 247, 360–361 ER model, 45 and data availability, 423–425 block, 397 Cartesian product, 108 as database advantage, 15 block-level I/O protocols, 415 cascading rollback, 447–448 defined, 27 block pointer, 371 catalog Baesens, B., 82, 85 blockchains, 524 and metadata role, 80 bag. See multiset blocking factor (BF), 361 data types in, 401 BASE principle, 312 bootstrapping, 683–684 defined, 10–11 © in this web service Cambridge University Press www.cambridge.org Cambridge University Press 978-1-107-18612-5 — Principles of Database Management Wilfried Lemahieu , Seppe vanden Broucke , Bart Baesens Index More Information 772 Index categorization Codd, Edgar F., 104–105, 120 consistency based on architecture, 30–31 coefficient of determination, 688 in CAP theorem, 539 based on data model, 28–30 collection literal, 218 eventual, 312–313, 540 based on degree of simultaneous collection types (OO), 245–247 quorum-based, 542–544 access, 30 collision, 366 consistency dimension, 82–84 based on usage, 31–32 column constraints, 151 consistency in databases, 300–301 in EER model, 54–55 column value, 399 consistency property, 15, 452 mapping EER to relational, 136–137 column-oriented DBMS, 331–332 consistent hashing, 309–310, 538 central processing unit (CPU), 352 combination notation, 404 constructor, 210 central storage, 352 combined approach, 270–271 container managed relationships centrality metrics, 696–698 commercial analytical software, (CMR), 488 centralized DBMS architecture, 706–708 container-managed persistence (CMP), 459–460 committed, 432 488 centralized system architecture, 30 common gateway interface (CGI), contextual category, 82 chaining, 370 504–507 contingency plan, 398–421 changeability property, 65 Common Language Runtime (CLR), Control Objectives for Information and changed data capture (CDC), 598 468–471 Related Technology (COBIT), character 
large object (CLOB), 247, compatibility matrix, 445 620 360–361, 610 compensation-based transaction models, correlated nested queries, 175–178 CHECK constraint, 151 535–538 cost-based optimizer, 400 checkpoints, 435 completeness constraint, 53 COUNT, 161, 222–223 Chen, Peter Pin-Shan, 40 completeness dimension, 82–84 credit scoring models, 667 choreography, 604–606 composite aggregation, 62–63 cross-table, 573 churn prediction, 667 composite attribute type, 42 cross-validation, 682–683 class, 57 composite key, 362 CRUDS functionality, 608 class diagram, 58 comprehensibility, 689 CUBE, 577–578 class invariant, 65 conceptual data model cube (three-dimensional), 575 classification advantages of, 13 cumulative accuracy profile (CAP), defined, 673 defined, 9 686–687 performance measures for, 684–687 in design phase, 39 cursor mechanism, 474 classification accuracy, 685 EER, 52–57 customer relationship management cleansing, 566 ER, 40–52, 121–133 system (CRM), 5, 628 client–server DBMS architecture, 31, not stored in catalog, 10 customer segmentation, 667 459–460 physical design architecture, cutoff, 685 client-side scripting, 507–508 357–358 Cutting, Doug, 630 CLOB (character large object), 247, UML class diagram, 57–66 cylinder, 354 360–361, 610 conceptual/logical layer, 10 Cypher cloud DBMS architecture concurrency control in graph-based database, 334 analytics, 705–706 defined, 6, 14–15 overview of, 335–341 data in the, 600–601 in distributed databases, 528–534 data warehousing in, 572 locking protocol, 444–452 data access request, 717 defined, 31 multi-version, 541–542 data accessibility, 84 tiered system architectures, 462 optimistic and pessimistic schedulers, data accuracy, 82

Master Data Management Whitepaper

Value and Implications of Master Data Management (MDM) and Metadata Management

Master Data Management Services (MDMS) System

What to Look for When Selecting a Master Data Management Solution

Centralized Or Federated Data Management Models, IT Professionals' Preferences

IBM Infosphere Master Data Management

Practical Fundamentals for Master Data Management: How to Build an Effective Master Data Capability as the Cornerstone of an Enterprise Data Management Program

Essential Elements of a Master Data Management Architecture

Learning Data Modelling by Example, Chapter 9: Master Data Management

Master Data Management (MDM) Improves Information Quality to Deliver Value

Optimize Master Data Management Through Federated Data Governance

IBM Industry Models and IBM Master Data Management Positioning And