The Relational Model Came Before Sqland Created a Need for SQL.The Power of SQL Liesnot in the Languageitself but in the Conceptsset Forth in the Relationalmodel
Total Page:16
File Type:pdf, Size:1020Kb
fi,'1.,.!r'$ tt Uqt has a very practical exterior but a very theoretical interior. That interior is the relational rnodel.The relational model came before SQLand created a need for SQL.The power of SQL liesnot in the languageitself but in the conceptsset forth in the relationalmodel. These concepts form the basisof SQL'sdesign and operation. The relational model is a powerful and elegantidea that has pervadednot only computer sciencebut our daily lives aswell, whether we know it or not. Like the automobile, or penicillin, it is one of the many great examplesof human thought and discoverythat has fundamentally impacted the way the world works. The relational model has spawneda billion-dollar industry, and has become an integral part of almost every other industry. You'll seethe relational model at work nearly everywherethat dealswith information of any kind-in Fortune 500companies, universities,hospitals, grocery stores,websites, routers, MP3 players,cell phones, and even smart cards.It is truly pervasive. Background The relationalmodel wasborn in 1969,inside of IBM. A researchernamed E. F. Codd distributed an internal paper titled "Derivability, Redundancy,and Consistencyof RelationsStored in LargeData Banks,"which defined the basic theory of the relational model. This paper was circulated internally and not widely distributed. In 1970,Codd published a more refined version of this paper in Communicationsof theACM called "A RelationalModel for large SharedData Banks,"which is more widely recognizedas the seminal work on relational theory. This is the paper that changedthe industry. Despite the enormous influence of this paper, the relational model in its entirety did not appearovernight, or within the scopeof a singlehistoric paper. It evolved,grew, and expanded over time and was the product of many minds, not just Codd alone.For example,the term "relational model" wasn't coined until 1979,nine yearsafter publication of the paper proposing it (seeDate, 1999,in the Referencessection). Codd's 12 rules, now famousas the working defi- nition of relational, weren't posited until 1985,appearing in a wvo-partseries published by Computerworldmagadne.Thus, while there was a seminal paper proposing the generalconcept, much of the relational model aswe know it today is actually the result of a seriesof papers,articles, and debatesby many people over a 30-yearperiod, and indeed continues to develop to this very day. That is alsowhy in some of the various quotes included in this chapter you may see terms such as "data base,"which may initially appear to be a typo. They aren't. At the time these statementswere made, the terminology still had not congealedand all sorts of various terms were being thrown around. 47 48 CHAPTER3 :.,;,THE RELATI0NAL M0DEL The Three Components This chapteri of it. It is merelya enoughwas known the relational principle By 1980, about model for Codd to identiff three presentsonly the r components: underpinnings of , . The structural component: This component defineshow information is structured, or in generaland in t represented.Specifically, all information is representedas relatioru,which are composedof splendor is far bel tuples,which in turn are composed of attribute and ualuecomponents. The relation is stamina to fully dr the sole data structure used to representall information in the database.Relations as learn more. The ru defined in the relational model are derived from relations in set theory, a branch of Computerworldat mathematics, and sharemany of their properties.They are formally defined in Codd's 1970paper. TheStruc . The integrity component: This component definesmethods that enforce relationships within and betweenrelations (or tables)in the structural component. Thesemethods The structural cor are called constraints,and are expressedin the form of rules. There are three principle componentsbuil( types of integrity: domain integrity, governingvalues in columns; entity integrity, governing first of Codd's 12r rows in tables;and referentialintegrity, goveming how tablesrelate to one another. Integrity has no analogin set theory, but rather is unique to relational theory. It was initially The Informt addressedin Codd's 1970paper and greatlyexpanded upon in the 1980sby Codd The first rule, call and others. defined as follows . The manipulative component: This aspectdefines the methods with which to operate on or manipulate information. Like relations, theseoperations also have their roots in 1. The Inforr mathematics.They are formalized in relational algebraand relational calculzsas origi- explicitly at tt nally presentedin Codd's 1972paper "RelationalCompleteness of Data Base Sublanguages." Date summarizes SQL,likewise, is structured similarly along theselines-so similarly, in fact, that it is hard Theentire inJ to talk about SQLwithout addressingthe relational model to somedegree, directly or indirectly. namely as exl It is true that SQLis for the most part a straighdorward language,which to many people appearsErs an island unto itself. But in reality, it is the offspring of the relational model, and in manyways 'l'hereare naroimg is clearly a reflectlon of it. Iogicalrepresenta within it. It is a kin SQLand the RelationalModel tlepictionof data, This chapterpresents the theoreticalroots of SQL,and examinesthe power and elegance The view pr behind SQL.It preparesyou to dealwith SQLnot in isolationbut in the contextof the relational aremade u you powerfi,ll, model. If notNng else,it shouldgive an appreciationfor how elegant, and complex The view is a beasta relational databasereally is. You may be surprisedby what eventhe most rudimentary or hardwar relational databasesare capableof. You don't haveto read this chapter in order to understandSQL; it merely providesa theoretical The logical re and historical backdrop.To this end, I presentthe theoretical aspectsin terms of severalinfluential rlatabaseis imple papersand articleswritten by Codd between lg70 and 1985that define the essenceof the relational l:rttercomponent model, alongwith the ideasand work of other contributorsto the field. Two of Codd'spapers rq)resentation-t mentioned here are availableonline, and are listed in the Referencessection at the end of the tlbles in a differe chapter.Again, others contributed to the developmentof the relationalmodel, but this chapter lrhysicalrepresen will draw primarily from Codd's work. tlreway in which rlcnt of physicalr Fi.:ffi CHAPTER3 tir,THE RELATI0NAL M0DEt 49 This chapteris not an exhaustivetreatment of the relationalmodel, or a completehistory of it. tt is merelyan appetizer.The SQLchapter that follows is the main course.This chapter ntiffthree principle presentsonly the minimal materialneeded to get an appreciationof the origin and theoretical underpinningsof SQL.We illustratethe relationalmodel in termsof both its threecomponents ationisstructured, or in generaland in the contextof Codd's 12rules in particular.The relationalmodel in its full whicharecomposed of splendoris far beyond the scopeof this book, and I haveneither the qualificationsnor the nents,The relation is stamina to fully describeit. Seethe Referencessection at the end of this chapter if you want to tabase.Relations as learn more. The rulesas they are presentedhere are taken directlyfrom Codd'sOctober 14,1985, heory,abranch of Comp uter wor ld article. dlydefinedin Codd's TheStructural Gomponent enforcerelationships rent.These methods The structuralcomponent of the relationalmodel laysthe foundationupon which the other reare three principle components build. It definesthe form in which information is represented.It is defined by the iryintegrity, governing first of Codd's 12rules, which is the cornerstoneof the relational model. rofleilother.Integrity ry,ltwasinitially The Information Principle he1980s by Codd The first rule, called the Information RuIe,is also known as the Information Principle. It is defined as follows: /ithwhichto operate ;ohavetheir roots in I. The Information Rule. All information in a rel"ational data base is represented ywlcalculusasorigi- explicitly at the logical level and in exactly one way-by ualuesin tables. fDataBase Date summarizesthis rule as follows: infact,that it is hard The entire information content of the databaseis representedin one and only one way, ,dfectly or indirectly. namely as explicit ualuesin column positions in rows in tables. ranypeopleappeils as lel,and in manyways There are two import expressionshere: "logical level" and "valuesin tables."The logic level, or logical representation,refers to the way that you, the user, seethe databaseand information within it. It is a kind of ideal worldview for information. The logical level is a consistent,uniform depiction of data, which has Wvoimportant properties: verandelegance . The view presentedin the logicallevel consists of tables,made up of rows,which in turn ntextofthe relational are made up of values. or.verfrd,and complex . The view is completely independent of the databasesystem-the technolory (software hemostrudimentary or hardware) that enablesit. providesatheoretical The logical representationis a world unto itself,which is completely distinct from how the sofseveralinfluential databaseis implemented, or how it storesdata physically,or how it operatesinternally. These senceoftherelational latter components-software, operating system,and hardware-are referredto as the physical woofCodd's papers representation-the technology of the system.If the databasevendor decidesto store database ionat the end of the tables in a different