Database Management System Market Dynamics
Why DBMSs have a resurgent role in the application platform landscape.
by Peter O'Kelly
October 15, 2004

Database management systems (DBMSs) have a fundamental role in application platforms, but there is currently a lot of market confusion about how, when, where, and why DBMSs should be used. This column provides an overview of DBMS trends and the reasons why DBMSs have a resurgent and expanding role in the broader application platform landscape. The next Trends & Analysis column will assess how Microsoft's SQL Server 2005 product family fits into both the emerging DBMS market landscape and Microsoft's overall Windows Server System strategy.

A Brief History of DBMSs

To establish context, it's useful to briefly review the fundamental DBMS value proposition. DBMSs are used to securely and robustly manage databases; databases are sets of data that capture descriptions of real-world things such as customers and product inventory. DBMSs embody sophisticated technology to efficiently and concurrently make data available to applications and users without compromising database integrity.

The DBMS market has evolved through several generations, starting with hierarchical DBMS products such as IBM's IMS, which was introduced during the late 1960s. Network DBMS products (also known as CODASYL) came along next, with Cullinet's IDMS (today a Computer Associates "legacy" DBMS) serving as a leading example. Relational DBMSs entered the market approximately 25 years ago and, after a phase of "database wars" between relational and network DBMS products, have dominated the DBMS market for most of the past 20 years. IMS and IDMS are still used for legacy applications, but today the vast majority of database developers work with relational DBMSs (and hereafter "DBMS" refers to relational DBMS unless otherwise noted).

DBMS evolution led to higher levels of abstraction for database developers and users.
For example, while hierarchical and network DBMSs presented models that mixed logical and physical (implementation-oriented) considerations, developers and users working with relational DBMSs focus primarily on logical abstractions that more closely mirror the real-world things and events described by the database; they aren't conceptually burdened with pointer chains, buffer pools, and other low-level details. As a result, DBMSs are conducive to improved developer productivity as well as increased overall system security and robustness.

From www.ftponline.com/wss/2004_11/magazine/columns/trends/default_pf.aspx, 13 April 2005

Perennial DBMS Challenges

DBMSs offer more secure, robust, and productive options for data management, so why isn't all data stored in DBMSs today? It's estimated that 70 percent or more of most organizational data isn't currently stored in DBMSs, but is instead scattered across file systems, e-mail messages, and assorted specialized content/document management systems. Several longstanding challenges have prevented broader DBMS applicability and, as we'll see momentarily, the advent of Web-centric applications was in some respects a further setback for DBMSs.

One major DBMS challenge, historically, had to do with cost and complexity. DBMSs, especially for high-end systems, were expensive to license and maintain. They required DBMS-trained developers and administrators, and also required extensive fine-tuning, which in turn resulted in protracted application development and test cycles. DBMSs also have more demanding hardware requirements than simpler, "good enough," file system-based alternatives, and that was a key consideration during the 1980s and 1990s, when hardware was still expensive relative to today's market.

Another set of challenges stemmed from incomplete standards and limited data models.
ANSI SQL has made major advances since 1999, but the previous version of the standard, published in 1992, was incomplete and led DBMS vendors to implement proprietary SQL extensions. Until recently, DBMSs have also been constrained in data model expressiveness, with most DBMS products, for example, unable to handle multivalued columns or recursive queries. As a result, DBMSs were considered appropriate for text and number "crunching" but not for documents or other, more elaborate data types.

For most developers, today's world of persistent data is divided among three domains, as suggested in Figure 1. Databases are structured sets of data, designed to be used by applications. Documents are designed for human comprehension and include sequence, hierarchy, and narrative dimensions that aren't present in databases. Objects are programming abstractions that combine structure and behavior in a model optimized for developer productivity.

Historically, the three domains were addressed with three largely distinct tool sets. DBMSs served databases, content and document management systems were for documents, and object-oriented programming tools and application servers fit naturally with objects.

Figure 1. Three Application Domains. Historically, developers have used different tools when working with databases, documents, and objects.

At the peak of the client/server wave during the early 1990s, some DBMS products expanded to address some object-related capabilities as well as traditional databases. Illustra was a leading example, building on the UC Berkeley Postgres project, which in turn followed the pioneering Ingres research also led by Michael Stonebraker at UC Berkeley. Sybase, another DBMS pioneer during the 1980s, led with triggers and stored procedures, putting more application logic into the DBMS.
Triggers and stored procedures were important innovations because they meant that application logic, as well as data, benefited from fundamental DBMS capabilities. For example, a procedure for determining a customer's credit rating could be defined once and then applied consistently by all applications, rather than being reimplemented in each application.

Several object-oriented database (OODB) products were also introduced during this period, and many people expected there would be another wave of "database wars," this time with relational being displaced by object-oriented DBMSs. OODB products failed to expand beyond niche status, however, and some have recently been creatively recycled, as we'll see in a moment.

The use of different tools for different data models produced what has been termed an "impedance mismatch," creating challenges for developers who need to work with multiple tools and models. It has been difficult to use SQL with object-oriented programming tools, for example.

Relegated to a Reduced Role in the Rush to the Web

A funny thing happened on the way to the Web: As the rush to commercial Web applications started during the mid-1990s, DBMSs, which had until that time been evolving to become the center of the client/server application platform, were relegated to a reduced role. As Web applications shifted developers' focus to HTML-based pages (documents), often drawing on data from disparate systems, the market shifted to the five-tiered model depicted in Figure 2.

Figure 2. An Application Server-Centric View of the Platform Stack. The Web application wave reduced the role for DBMSs, often shifting business logic to application servers and integration to integration brokers.

The Web application-led transition produced some "conventional wisdom" that was perplexing for many DBMS-focused developers.
For example, it became common practice to create super-user database identities and to optimize database connection pooling, shifting identity, authentication, and authorization to the application (or application-server) level. This was often done for performance, as DBMS deployments that were designed to support tens or hundreds of concurrent users often couldn't scale to serve the far larger user populations of successful Web applications.

Another best-practice shift was a movement away from DBMS-managed triggers and stored procedures, with application/business logic migrating to the middle tier, in application servers. This was a pragmatic option when applications had to work with data from disparate data sources, but it meant moving application logic from DBMSs, where it was consistently used by all applications, to the middle tier, where it was easier to circumvent (inadvertently or deliberately). The Web application wave also resulted in extensive midtier data caching, a practice with serious implications for database integrity. In some respects, this shift was a step backward to earlier approaches, when separate transaction processing monitor and DBMS layers were widely used, except that the application servers weren't as tightly integrated with DBMSs.

XML has also impacted DBMS usage patterns. Although XML is still relatively young (the W3C completed its XML 1.0 recommendation in February 1998), XML documents have rapidly grown into a pivotal role in interapplication data exchanges. Several XML database products have been introduced to address the need for more robust, secure, and scalable XML data management, but, as with earlier OODB products, the XML database products have been niche offerings and are not poised to displace DBMSs. Indeed, many XML database products