CS 5614: Basic Data Definition and Modification in SQL 67
Introduction to SQL
Structured Query Language (‘Sequel’): serves as DDL as well as DML.
Declarative: say what you want without specifying how to do it. One of the main reasons for the commercial success of DBMSs.
Many standards and implementations: ANSI SQL; SQL-92/SQL-2 (null operations, outerjoins etc.); SQL3 (recursion, triggers, objects); vendor-specific implementations.
“Bag Semantics” instead of “Set Semantics”: used in commercial RDBMSs.
CS 5614: Basic Data Definition and Modification in SQL 68
Example:
Create a Relation/Table in SQL
CREATE TABLE Students (sid CHAR(9), name VARCHAR(20), login CHAR(8), age INTEGER, gpa REAL);
Support for basic data types: CHAR(n), VARCHAR(n), BIT(n), BIT VARYING(n), INT/INTEGER, FLOAT, REAL, DOUBLE PRECISION, DECIMAL(p,d), DATE, TIME etc.
CS 5614: Basic Data Definition and Modification in SQL 69
More Examples
And one for Courses
CREATE TABLE Courses (courseid CHAR(6), department CHAR(20));
And one for their relationship!
CREATE TABLE takes (sid CHAR(9), courseid CHAR(6)); Why?
Can also provide default values
CREATE TABLE Students (sid CHAR(9), .... age INTEGER DEFAULT 21, gpa REAL);
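The effect of a DEFAULT clause can be tried out with a minimal sketch. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in DBMS; the inserted values are illustrative only, and other systems may differ in details.

```python
import sqlite3

# In-memory database; SQLite is only a convenient stand-in DBMS here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students ("
    " sid CHAR(9), name VARCHAR(20), login CHAR(8),"
    " age INTEGER DEFAULT 21, gpa REAL)"
)

# Leave out age and gpa: age takes its declared default,
# gpa has no default and is stored as NULL.
conn.execute(
    "INSERT INTO Students (sid, name, login) "
    "VALUES ('53688', 'Mark', 'mark2345')"
)

row = conn.execute(
    "SELECT age, gpa FROM Students WHERE sid = '53688'"
).fetchone()
print(row)  # (21, None)
```

Python shows SQL NULL as None, so the missing gpa reads back as None while age reads back as 21.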
CS 5614: Basic Data Definition and Modification in SQL 70
Examples Contd.
DATE and TIME: implementations vary widely. Typically treated as strings of a special form, which allows comparisons of an ordinal nature (<, > etc.).
DATE example: '1999-03-03' (no Y2K problems)
TIME examples: '15:30:29', '15:30:29.3875'
Deleting a Relation/Table in SQL
DROP TABLE Students;
CS 5614: Basic Data Definition and Modification in SQL 71
Modifying Relation Schemas
‘Drop’ an attribute (column)
ALTER TABLE Students DROP login;
‘Add’ an attribute (column)
ALTER TABLE Students ADD phone CHAR(7);
What happens to the new entry for the old records? The default is NULL, or specify one, say
ALTER TABLE Students ADD phone CHAR(7) DEFAULT 'unknown';
Always begin with ‘ALTER TABLE’
Can use DEFAULT even with regular definition (as in Slide 69)
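What old records see after a column is added can be checked with a small sketch. It again assumes SQLite via Python's sqlite3 module; the fax column is a hypothetical extra column added only to show the no-default case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid CHAR(9), name VARCHAR(20))")
conn.execute("INSERT INTO Students VALUES ('53688', 'Mark')")  # an "old" record

# New column without a default: existing rows read back NULL.
conn.execute("ALTER TABLE Students ADD COLUMN fax CHAR(7)")
# New column with a default: existing rows read back the default value.
conn.execute("ALTER TABLE Students ADD COLUMN phone CHAR(7) DEFAULT 'unknown'")

old_row = conn.execute("SELECT sid, fax, phone FROM Students").fetchone()
print(old_row)  # ('53688', None, 'unknown')
```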
CS 5614: Basic Data Definition and Modification in SQL 72
How do you enter/modify data?
INSERT command
INSERT INTO Students VALUES ('53688', 'Mark', 'mark2345', 23, 3.9);
Cumbersome (use bulk loading; described later)
DELETE command
DELETE FROM Students S WHERE S.name = 'Smith';
UPDATE command
UPDATE Students S SET S.age=S.age+1, S.gpa=S.gpa-1 WHERE S.sid = '53688';
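The three commands can be run end to end in a short sketch. It assumes SQLite via Python's sqlite3 module; the second student row is made up for illustration, and the table alias S is dropped because alias support in DELETE and UPDATE varies between systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students (sid CHAR(9), name VARCHAR(20),"
    " login CHAR(8), age INTEGER, gpa REAL)"
)

# INSERT: one row per statement (the second row is hypothetical).
conn.execute("INSERT INTO Students VALUES ('53688', 'Mark', 'mark2345', 23, 3.9)")
conn.execute("INSERT INTO Students VALUES ('53650', 'Smith', 'smith123', 19, 3.2)")

# DELETE: removes every row matching the condition.
conn.execute("DELETE FROM Students WHERE name = 'Smith'")

# UPDATE: right-hand sides are evaluated on the old values of the row.
conn.execute("UPDATE Students SET age = age + 1, gpa = gpa - 1 WHERE sid = '53688'")

rows = conn.execute("SELECT sid, name, age, gpa FROM Students").fetchall()
print(rows)  # one remaining row: Mark, now aged 24 with gpa close to 2.9
```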
CS 5614: Basic Data Definition and Modification in SQL 73
Domains
Domains: Similar to Structs and other user-defined types
CREATE DOMAIN Email AS CHAR(8) DEFAULT 'unknown'; .... login Email -- instead of login CHAR(8) DEFAULT 'unknown'
Advantages: can be reused
junkaddress Email, fromaddress Email, toaddress Email, ....
Can DROP DOMAINS too!
DROP DOMAIN Email; Affects only future declarations
CS 5614: Basic Data Definition and Modification in SQL 74
Keys
To specify keys: use PRIMARY KEY or UNIQUE. Declare alongside the attribute; for multiattribute keys, declare on a separate line.
CREATE TABLE takes ( sid CHAR(9), courseid CHAR(6), PRIMARY KEY (sid,courseid) );
What's the difference between PRIMARY KEY and UNIQUE? Typically only one PRIMARY KEY but any number of UNIQUE keys. The implementor is allowed to attach special significance.
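A multiattribute key constrains the combination of values, not each attribute separately. The sketch below illustrates this, assuming SQLite via Python's sqlite3 module; the course ids are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE takes (sid CHAR(9), courseid CHAR(6),"
    " PRIMARY KEY (sid, courseid))"
)
conn.execute("INSERT INTO takes VALUES ('53688', 'CS5614')")
# Same sid with a different course is fine: only the pair must be unique.
conn.execute("INSERT INTO takes VALUES ('53688', 'CS5604')")

try:
    conn.execute("INSERT INTO takes VALUES ('53688', 'CS5614')")  # repeated pair
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

count = conn.execute("SELECT COUNT(*) FROM takes").fetchone()[0]
print(duplicate_rejected, count)  # True 2
```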
CS 5614: Basic Data Definition and Modification in SQL 75
Creating Indices/Indexes
Why? Speeds up query processing time
For Students
CREATE INDEX indexone ON Students(sid); CREATE INDEX indextwo ON Students(login);
How to decide which attributes to place indices on? One is (typically) created by default on the PRIMARY KEY. Creation of indices on UNIQUE attributes is implementation-dependent. In general, physical database design/tuning is very difficult! Use tools: Microsoft SQL Server has an index selection wizard.
Why not place indices on all attributes? Too cumbersome for insertions/deletions/updates.
Like all things in computer science, there is a tradeoff! :-)
CS 5614: Basic Data Definition and Modification in SQL 76
Other Properties
‘NOT NULL’ instead of DEFAULT
CREATE TABLE Students (sid CHAR(9), name VARCHAR(20), login CHAR(8), age INTEGER, gpa REAL);
Can insert a tuple without a value for gpa; NULL will be inserted.
If we had specified
gpa REAL NOT NULL);
then the insert cannot be made!
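The difference NOT NULL makes can be seen by running the same insert against two declarations. This is a sketch assuming SQLite via Python's sqlite3 module; the table names Loose and Strict are hypothetical and exist only for the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Loose (sid CHAR(9), gpa REAL)")
conn.execute("CREATE TABLE Strict (sid CHAR(9), gpa REAL NOT NULL)")

# Without NOT NULL the missing gpa is stored as NULL.
conn.execute("INSERT INTO Loose (sid) VALUES ('53688')")
loose_gpa = conn.execute("SELECT gpa FROM Loose").fetchone()[0]

# With NOT NULL (and no DEFAULT) the same insert is rejected.
try:
    conn.execute("INSERT INTO Strict (sid) VALUES ('53688')")
    strict_rejected = False
except sqlite3.IntegrityError:
    strict_rejected = True

print(loose_gpa, strict_rejected)  # None True
```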
In this module, we will study three different query [...] in conjunction with the relational model. The first is relational algebra, an algebraic and procedural way [...]. [The second, Datalog, is] declarative in nature (students familiar with the PROLOG programming language will find this all too natural). In fact, there are close correspondences between database systems and PROLOG. PROLOG can be thought of as a database system where all the data fits into main memory. In contrast, a distinguishing feature of RDBMSs is that they operate on secondary storage. Other than that (and some differences in query processing), there are excellent analogs between the two cultures. Both PROLOG and RDBMSs are declarative. What we know to be relations are referred to as ‘predicates’ in PROLOG. A tuple is called a ‘ground fact’ in PROLOG. A table is called an ‘extensional definition’ in PROLOG, and so on. (Do not get bogged down by these specifics; we include them here just so that [...] connection [...].) [The third] query representation is, of course, SQL, which was introduced earlier. In the remainder of this document, we will introduce basic operations and manipulations that we can perform on relations. For each such basic operation, we will show how it is represented in each of the three different notations.
1. Union of Relations: The union of two relations R and S is the set of elements that are in R or in S or both. We assume that the schemas of R and S are alike (of course) and that their columns are also ordered alike (of course, again). We now give the three representations of the union relation:
$R \cup S$ (simple, right?)
T(x,y,z,w) <- R(x,y,z,w). T(x,y,z,w) <- S(x,y,z,w).
(SELECT * FROM R) UNION (SELECT * FROM S)
Notice that the variables x,y,z,w are merely ‘placeholders’ used for pattern matching; we could have written the above two rules as:
T(a,b,c,d) <- R(a,b,c,d). T(e,f,g,h) <- S(e,f,g,h).
(Footnote: No brownie points for guessing why we switched to a different document style for Module 2.)
2. Difference of Relations: The difference $R - S$ of two relations R and S is the set of elements that are in R but not in S. As usual, we assume that the schemas of R and S are alike and that their columns are also ordered alike. Moreover, notice that $R - S$ is not (generally) the same as $S - R$.
$R - S$
T(x,y,z,w) <- R(x,y,z,w), NOT S(x,y,z,w).
(SELECT * FROM R) EXCEPT (SELECT * FROM S)
3. Intersection of Relations: The intersection $R \cap S$ of two relations R and S is the set of elements that are in both R and S. Again, we assume that the schemas of R and S are alike and that their columns are also ordered alike. Notice that $R \cap S = R - (R - S)$.
$R \cap S$
T(x,y,z,w) <- R(x,y,z,w), S(x,y,z,w).
(SELECT * FROM R) INTERSECT (SELECT * FROM S)
4. Projection: Operates on a single relation and removes some of the columns. Useful for restricting information. Assume that we want only the name and address from relation R.
$\pi_{name,address}(R)$
T(x,y) <- R(x,y,z,w).
Thus z,w become irrelevant attributes. We could instead reinforce this by:
T(x,y) <- R(x,y,_,_).
SELECT name, address FROM R
5. Selection: Operates on a single relation and removes some of the rows. The removal is based on some condition specified by the user. For example, suppose we want all the tuples from R where the name is ‘Michael’.
$\sigma_{name='Michael'}(R)$
T(x,y,z,w) <- R(x,y,z,w), x = 'Michael'.
SELECT * FROM R WHERE name = 'Michael'
6. More Selections: Selections become complicated when the conditions get longer, particularly with Datalog (the other two forms are pretty straightforward). Consider, for example, when we want all the tuples from R where the name is ‘Michael’ OR the gender is ‘M’. We write this in Datalog as:
T(x,y,z,w) <- R(x,y,z,w), x = 'Michael'.
T(x,y,z,w) <- R(x,y,z,w), z = 'M'.
Notice that we ‘split’ the condition across two rules, just as we do in the union case (come to think of it, the OR is indeed the union of two conditions). Similarly, the comma in each of the above Datalog rules models the AND condition. Let’s consider a more complicated condition: select all the tuples from R that are neither male nor have the name ‘Fox’.
T(x,y,z,w) <- R(x,y,z,w), z <> 'M', x <> 'Fox'.
While selecting the tuples from R that are not both male and have the name ‘Fox’ is achieved by (why?):
T(x,y,z,w) <- R(x,y,z,w), z <> 'M'.
T(x,y,z,w) <- R(x,y,z,w), x <> 'Fox'.
7. Cartesian Product: This is the set of ‘pairs’ formed by choosing the first element from R and the second element from S. In case of confusion among attribute names, disambiguate them by prefixing them with the relation name. The cartesian product of two relations R and S is given by:
$R \times S$
T(x1,y1,z1,w1,x2,y2,z2,w2) <- R(x1,y1,z1,w1), S(x2,y2,z2,w2).
SELECT R.name, R.address, R.gender, R.birthdate, S.name, S.address, S.gender, S.birthdate FROM R, S
Notice how we disambiguate attributes in the SQL version. Also notice that if R has $m$ tuples and S has $n$ tuples, then $R \times S$ will have $mn$ tuples.
8. Theta-Join: This is just like the cartesian product but goes a step further. After forming the $mn$ tuples, it selects only those that satisfy some condition. Thus, the theta-join of two relations has the same number of columns as the cartesian product but not necessarily the same number of rows (some of the rows will be removed because they dissatisfy some condition). Consider, for example, that we want to find ‘pairs’ of students such that the first person in the pair is always ‘Michael’. We get:
$R \bowtie_{R.name='Michael'} S$
T(x1,y1,z1,w1,x2,y2,z2,w2) <- R(x1,y1,z1,w1), S(x2,y2,z2,w2), x1 = 'Michael'.
SELECT R.name, R.address, R.gender, R.birthdate, S.name, S.address, S.gender, S.birthdate FROM R, S WHERE R.name = 'Michael'
Notice that we introduce the ‘bow-tie’ symbol $\bowtie$ suffixed by $\theta$, indicating the theta-join. Also, realize that $R \bowtie_{\theta} S = \sigma_{\theta}(R \times S)$.
9. Natural Join: This is just a cleverer way to combine two relations into one. The basic idea is that if the two relations have some column(s) in common, then we can ‘collapse’ them into the same column in the final output. Moreover, we can do this only if the two tuples from the two relations agree in those common columns. Thus, it is similar to the cartesian product, but we ‘join’ only those pairs that match in their common attributes. Consider that we want to find the name, address, gender, gpa and birthdate of students in a single relation. Notice that
gpa is available from L, but the other four attributes are present in R. So, we need a way [to combine the] two relations:
$R \bowtie L$
T(x,y,z,p,w) <- R(x,y,z,w), L(x,y,p).
SELECT R.name, R.address, R.gender, L.gpa, R.birthdate FROM R, L WHERE R.name = L.name AND R.address = L.address
10. Renaming: This is just a cool thing, in case we have too many confusions arising. We can use this operator to rename a relation and/or one or more of its attributes. For example, assume we want to rename R to [...] and make its columns be called [...]. This is most useful with relational algebra, like so: $\rho_{[...]}(R)$
CS 5614: Misc. SQL Stuff, Safety in Queries 81
What is still to be covered (and will be)
Declaring constraints: Domain Constraints; Referential Integrity (Foreign Keys)
More SQL Stuff: Subqueries; Aggregation
SQL Peculiarities: Strange Phenomena; More on Bag Semantics; Ifs and Buts
Embedding SQL in a Programming Environment: Accessing DBs from within a PL (will be covered in Module 3)
CS 5614: Misc. SQL Stuff, Safety in Queries 82
What will be mentioned (but not covered in detail)
Triggers
Read Cow Book or Boat Book
More SQL Gory Details
Recursive Queries (SQL3)
Why do we need these?
Security
Authorization and Privacy
Trends towards Object Oriented DBMSs
CS 5614: Misc. SQL Stuff, Safety in Queries 83
Tuple-Based Domain Constraints
Already seen: NOT NULL, UNIQUE, PRIMARY KEY etc.
In general:
CREATE TABLE Students (sid CHAR(9), name VARCHAR(20), login CHAR(8), age INTEGER, gpa REAL, CHECK (gpa >= 0.0) );
Note: Implementations vary, but this is the general idea.
Other complicated forms: constraints on whole relations, assertions.
CS 5614: Misc. SQL Stuff, Safety in Queries 84
Referential Integrity Constraints
Foreign Keys: An attribute a of R1 is a foreign key if it “references” the primary key (say b) of another relation R2. In addition, there is a referential integrity constraint from R1 to R2.
Example: login is a FOREIGN KEY for Students
CREATE TABLE Students (sid CHAR(9) PRIMARY KEY, name VARCHAR(20), login CHAR(8) REFERENCES Accounts(acct), age INTEGER, gpa REAL );
CREATE TABLE Accounts ( acct CHAR(8) PRIMARY KEY );
CS 5614: Misc. SQL Stuff, Safety in Queries 85
Alternatively
Can use the “FOREIGN KEY” construct
CREATE TABLE Students (sid CHAR(9) PRIMARY KEY, name VARCHAR(20), login CHAR(8), age INTEGER, gpa REAL, FOREIGN KEY (login) REFERENCES Accounts(acct) );
CREATE TABLE Accounts ( acct CHAR(8) PRIMARY KEY );
Note: in both cases, acct should be declared as PRIMARY KEY for Accounts!
CS 5614: Misc. SQL Stuff, Safety in Queries 86
SQL Subqueries
Given Students(sid,name,login,age,gpa) and HasCar(sid,carname), find the car of the student with login='mark'.
Traditional way:
SELECT carname FROM Students, HasCar WHERE Students.login='mark' AND Students.sid=HasCar.sid;
The ‘Subway’:
SELECT carname FROM HasCar WHERE sid = (SELECT sid FROM Students WHERE login='mark');
CS 5614: Misc. SQL Stuff, Safety in Queries 87
Aggregation
Given Students(sid,name,login,age,gpa), find the average of the ages of all the students.
Solution:
SELECT AVG(age) FROM Students;
Other operations: SUM (summation of all the values in a column), MIN (least value), MAX (highest value), COUNT (the number of values), e.g.
SELECT COUNT(*) FROM Students;
COUNTs the number of Students!
CS 5614: Misc. SQL Stuff, Safety in Queries 88
Ordering
Given Students(sid,name,login,age,gpa), list the students in (ascending) alphabetical order of name.
Solution:
SELECT * FROM Students ORDER BY name;
and that for DESCending ORDER is
SELECT * FROM Students ORDER BY name DESC;
Default is ASC
CS 5614: Misc. SQL Stuff, Safety in Queries 89
Grouping
Given Students(sid,name,login,age,gpa), find the names of students with gpa=4.0 and group people with like ages together.
Solution:
SELECT name FROM Students WHERE gpa=4.0 GROUP BY name;
CS 5614: Misc. SQL Stuff, Safety in Queries 90
More on Grouping
Given Students(sid,name,login,age,gpa), find the names of students with gpa=4.0, group people with like ages together, and show only those groups that have more than 2 students in them.
Solution:
SELECT name FROM Students WHERE gpa=4.0 GROUP BY name HAVING COUNT(*) > 2;
CS 5614: Misc. SQL Stuff, Safety in Queries 91
Summary of SQL Syntax
General form: SELECT ... FROM ... WHERE ... GROUP BY ... HAVING ...
Order of execution: FROM, WHERE, GROUP BY, HAVING, SELECT
CS 5614: Misc. SQL Stuff, Safety in Queries 92
Views
Can be viewed as temporary relations: they do not exist physically BUT can be queried and modified (sometimes) just like normal relations.
Example:
CREATE VIEW GoodStudents(id,name) AS SELECT sid,name FROM Students WHERE gpa=4.0;
SELECT * FROM GoodStudents WHERE name='Mark';
Can we update the original relation using the GoodStudents VIEW?
CS 5614: Misc. SQL Stuff, Safety in Queries 93
Beginning of Weird Stuff
SQL uses Bag Semantics, meaning: it does not normally eliminate duplicates, e.g. in the SELECT clause.
BUT (a big BUT) this doesn’t apply to UNION, INTERSECT and difference (EXCEPT).
Either way, it provides facilities to do whatever we want.
If you want duplicates eliminated in the SELECT clause, use SELECT DISTINCT .....
If you want to prevent elimination of duplicates in UNION etc., use (SELECT ...) UNION ALL (SELECT ...)
Likewise for INTERSECT and difference.
CS 5614: Misc. SQL Stuff, Safety in Queries 94
... and that’s just the tip of the iceberg
What happens with the following code?
SELECT R.A FROM R,S,T WHERE R.A = S.A OR R.A = T.A
when R(A) has {2,3}, S(A) has {3,4} and T(A) is {}
Confusion Reigns!
CS 5614: Misc. SQL Stuff, Safety in Queries 95
Safety in Queries
Some queries are inherently “unsafe” and should not be permitted in DB access.
Example: Given only the following relation Students(id), find all those who are not students.
Easy to distinguish unsafe queries via common sense: the final result is not closed.
Is there an automatic way to determine “safety”?
CS 5614: Misc. SQL Stuff, Safety in Queries 96
Answer: Yes!
Easiest to spot when written in Datalog:
Answer(id) <- NOT Students(id).
Golden Rule: Any variable that appears anywhere must also appear in a non-negated body part. In this case, id causes the query to be unsafe.
Example of a safe query:
Answer(id) <- People(id), NOT Students(id).
This produces all those people who are NOT students. It is safe because the People relation provides a reference point: id, which appears in a negated body part, also appears non-negated.
CS 5614: Misc. SQL Stuff, Safety in Queries 97
More Dangers
The problem is not restricted to negated body parts; it occurs even with arithmetic body parts (why?).
Given only the following relation Students(id,age), find all those numbers that are greater than the age of some student.
Answer(x) <- Students(id,age), x>age.
Extension to the previous rule: Any variable that appears anywhere must also appear in a non-negated, non-arithmetic body part. In this case, x causes the query to be unsafe because it doesn’t appear in a non-negated, non-arithmetic part.
CS 5614: Misc. SQL Stuff, Safety in Queries 98
One More Example
Given a relation Composite(x) which lists all the composite numbers, write a query to find the prime numbers.
Wrong way: Prime(x) <- NOT Composite(x).
Right way: Prime(x) <- Number(x), NOT Composite(x).
Safety in Other Notations
Relational Algebra: via the subtraction operator
SQL: via the EXCEPT construct
Notice how SQL and Relational Algebra do not allow unsafe queries, because there is no way to write such queries with the given constructs. How clever, eh?
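The safe Prime query can be written with EXCEPT and run directly. This sketch assumes SQLite via Python's sqlite3 module; the contents of Number and Composite (the integers 2 to 10) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Number (x INTEGER)")
conn.execute("CREATE TABLE Composite (x INTEGER)")

# Number supplies the finite "reference point"; Composite is what gets subtracted.
conn.executemany("INSERT INTO Number VALUES (?)", [(n,) for n in range(2, 11)])
conn.executemany("INSERT INTO Composite VALUES (?)", [(4,), (6,), (8,), (9,), (10,)])

# Prime(x) <- Number(x), NOT Composite(x).
primes = [
    row[0]
    for row in conn.execute(
        "SELECT x FROM Number EXCEPT SELECT x FROM Composite ORDER BY x"
    )
]
print(primes)  # [2, 3, 5, 7]
```

EXCEPT always needs a left operand, so the unsafe form that negates Composite on its own has no direct counterpart here.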
:-) It is always amazing how “languages” force you to think in a certain manner: a problem long studied by philosophers.
CS 5614: Misc. SQL Stuff, Safety in Queries 99
Recursion in Queries
Used to specify an indefinite number of “applications” of a relation.
Example: Given only the following relation Person(name,parent), find all the ancestors of “Mark”.
Easy to find an ancestor at a predefined level:
parent: use Person
grandparent: join Person with Person
great-grandparent: join Person with Person with Person
and so on.
To find an ancestor at no predefined level, we need to join Person with Person an “indefinite” number of times.
SQL3 provides support for recursive definitions.
CS 5614: Misc. SQL Stuff, Safety in Queries 100
Solution in Datalog
First, the base case:
Ancestor(x,y) <- Person(x,y).
Then, the inductive step:
Ancestor(x,y) <- Person(x,z), Ancestor(z,y).
Can also write the previous rule as
Ancestor(x,y) <- Ancestor(x,z), Ancestor(z,y).
why?
CS 5614: Misc. SQL Stuff, Safety in Queries 101
Recursion in SQL3
Use the “WITH RECURSIVE ... SELECT” construct.
Example:
WITH RECURSIVE Ancestor(name,ans) AS ( (SELECT * FROM Person) UNION (SELECT Person.name,Ancestor.ans FROM Person, Ancestor WHERE Person.parent=Ancestor.name) ) SELECT * FROM Ancestor;
Use with caution: some kinds of recursive queries will not be allowed!
Example: the following Datalog query might not be allowed in SQL3
Ancestor(x,y) <- Ancestor(x,z), Ancestor(z,y).
because the rule involves 2 applications of the recursively defined predicate. “Linear recursion” allows only one (as in the SQL code above).
CS 5614: Misc. SQL Stuff, Safety in Queries 102
Final Example
Be careful when combining negation, aggregation and recursion: a perfect recipe for disaster!
Mutual Recursion
Odd(x) <- Number(x), NOT Even(x).
Even(x) <- Number(x), NOT Odd(x).
What are the problems?
Notice that the query appears “safe” (per Slide 96).
Cycles indefinitely! No proper base cases.
Illegal in SQL3: not because of mutual recursion but because there is no “unique interpretation” of the query.
E.g., 6 could be either in Odd or in Even; both are acceptable!
Sometimes mutual recursion is good and fruitful, if written properly with proper limiting constraints and base cases.
CS 5614: Misc. SQL Stuff, Safety in Queries 103
Introduction to Deductive DBMSs
Intersection of traditional RDBMSs and Logic Programming
Example systems: CORAL (Univ. Wisc.), LDL++ (MCC), XSB Systems (SUNY, Stony Brook)
Can be viewed as extending PROLOG-type systems with secondary storage, or as extending RDBMSs with deductive functionality.
Mappings: commonalities between PROLOG and DBMSs
Predicate: Relation
Argument: Attribute
Ground Fact: Tuple
Extensional Definition: Table (defined by data)
Intensional Definition: Table (defined by a view)
CS 5614: Misc. SQL Stuff, Safety in Queries 104
PROLOG vs. RDBMSs
Characteristics of PROLOG: tuple-at-a-time; backward chaining; top-down; goal-oriented; fixed evaluation strategy (depth-first)
Characteristics of RDBMSs: set-at-a-time (recall relational algebra); forward chaining; bottom-up; the query optimizer figures out a good evaluation strategy
Example:
ancestor(X,X).
parent(amy,bob).
ancestor(X,Y) <- parent(X,Z), ancestor(Z,Y).
Query: find the ancestors of bob: ancestor(X,bob)?
CS 5614: Misc. SQL Stuff, Safety in Queries 105
PROLOG Pitfalls
Previous example: linear recursion, tail recursion
What if we reverse the order of clauses in
ancestor(X,Y) <- parent(X,Z), ancestor(Z,Y).
PROLOG goes into an infinite loop (why?)
What if we make it
ancestor(X,Y) <- ancestor(X,Z), ancestor(Z,Y).
“Not Linear” Recursion
Inference = Resolution + Unification
Entailment in First Order Logic is semi-decidable
CS 5614: Misc. SQL Stuff, Safety in Queries 106
Example of Deductive Query Optimization
Same-Generation: the Hello World of DDBMSs
sg(X,Y) <- flat(X,Y).
sg(X,Y) <- up(X,U),sg(U,V),down(V,Y).
Magic: A Rewriting Technique
Rewrite the query such that the advantages of bottom-up evaluation and goal-oriented behavior are combined.
Example: for the query sg(john,Z)? Magic produces
sg(X,Y) <- magic_sg(X),flat(X,Y).
sg(X,Y) <- magic_sg(X),up(X,U),sg(U,V),down(V,Y).
magic_sg(john).
How do you know when to stop? Iterative fixpoint evaluation (when the answer stops changing)
CS 5614: Misc. SQL Stuff, Safety in Queries 107
SQL in a Programming Environment
Incorporating SQL in a complete application
Why? There are some things we cannot do with SQL alone, e.g. preserving complex states, looping, branching etc.
Typically embed SQL in a host-language interface.
Problems: Impedance Mismatch. SQL operates on sets of tuples; languages such as C, C++ operate on an individual basis.
Solution: easy when SELECT returns only one row. When more than one row is returned, design an iterator to “run” over the results, called a “cursor”.
CS 5614: Misc. SQL Stuff, Safety in Queries 108
How are these implemented?
Vendor-specific implementations: ORACLE: PL/SQL (procedural extensions to SQL)
Open Database Connectivity Standard: provides a standard API for transparent database access; used when “database independence” is important; used when required to “connect” to diverse data sources
CS 5614: Misc. SQL Stuff, Safety in Queries 109
Tradeoffs
ODBC: originated by Microsoft in 1991; adds one more abstraction layer; not as fast as a native API (does not exploit “special features”); least-common-denominator approach; constantly evolving
PL/SQL etc.: “tailored” to the details of the underlying DBMS; might not extend to heterogeneous domains; modeled after a specific programming language (e.g. Ada for PL/SQL)
CS 5614: Misc. SQL Stuff, Safety in Queries 110
In Between: Stored Procedures
Used for developing “tightly-coupled” applications: “push computations” selectively into the database system; avoid performance degradation; work in the database address space instead of the application address space
Advantages: no sending SQL statements to and fro; eliminates pre-processing; speedup by an order of magnitude
Example applications: database administration; integrity maintenance and checks; database mining
Disadvantages: non-standard implementation; difficult to enforce transactional synchronization; without traditional SQL optimization, can lead to performance degradation
CS 5614: Misc. SQL Stuff, Safety in Queries 111
Introduction to Query Optimization
Helps attain declarativeness of RDBMSs
One of the main reasons for the commercial success of DBMSs
A motivating example: find all students with 4.0 gpa enrolled in CS5614
SELECT name FROM Students, Classroll WHERE Students.name = Classroll.studentname AND Students.gpa = 4.0 AND Classroll.coursename = 'CS5614'
Two strategies:
Do the join and then filter out the ones with gpa <> 4.0 and course <> CS5614
Filter out first the ones with gpa <> 4.0 and course <> CS5614 and then join
Which is better? Always good to “push selections” as far down the query parse tree as possible.
CS 5614: Misc. SQL Stuff, Safety in Queries 112
How does a Query Optimizer work?
Three requirements: a search space of “plans”; a cost model (for plan evaluation); an enumeration algorithm
Ideally:
Search space: contains both good and efficient plans
Cost models: cheap to compute and accurate
Enumeration algorithm: efficient (not a monkey-typewriter algorithm)
Example of a search space: see previous slide
Examples of cost models: #(tuples) evaluation, #(main memory locations) etc.
Example of an enumeration algorithm: sequential enumeration of a lattice of plans; dynamic programming vs. greedy approaches
CS 5614: Misc. SQL Stuff, Safety in Queries 113
A Simple Measure of “Cost”
#(Tuples) in a query
Easiest to compute for
Cartesian Product: #(R X S) = #(R)#(S)
Projection: #(Pi(R)) = #(R)
A notation for other operations: V(R,A) = number of distinct values of attribute “A” in R
Formulas assume that all values of “A” are equally likely in R; holds in the average case for most distributions (e.g. Zipf)
Selectivity factors for selection operations:
Equality tests: use 1/V(R,A)
< or > tests: use 1/3
“<>” test: use (V(R,A)-1)/V(R,A)
AND conditions: multiply selectivity factors
OR conditions: three choices
Sum of results from individual selectivity factors
Min(sum, total size of relation): why?
n(1-(1-m1/n)(1-m2/n)) formula: most accurate
CS 5614: Misc. SQL Stuff, Safety in Queries 114
Estimating the Size of a Join
Assume: R(X,Y) Join S(Y,Z)
Range of values:
Minimum: 0
In-between: #(R) (if Y is a foreign key for R and a key for S)
Maximum: #(R)#(S) (if the Y’s in R and S are all the same)
Assumptions for join size estimation: Containment of Value Sets; Preservation of Value Sets
Containment of Value Sets: if V(R,Y) <= V(S,Y) then the Y’s in R are a subset of the Y’s in S. Satisfied when Y is a foreign key in R and a key in S.
Preservation of Value Sets: V(R Join S,X) = V(R,X) and V(R Join S,Z) = V(S,Z). Why is this reasonable?
CS 5614: Misc. SQL Stuff, Safety in Queries 115
The Actual Estimate
Assume that V(R,Y) <= V(S,Y)
Every tuple in R has a chance of 1/V(S,Y) of joining with a given tuple of S
So every tuple in R is expected to join with #(S)/V(S,Y) tuples of S
So all the tuples in R together are expected to produce #(R)#(S)/V(S,Y) joined tuples
What if V(S,Y) <= V(R,Y)? Answer: #(R)#(S)/V(R,Y)
In general: #(R)#(S)/(max (V(S,Y),V(R,Y)))
What if there are multiple join attributes? Have a “max” factor in the denominator for each such attribute!
How to estimate #(R Join S Join T)? Does it matter which we do first?
Surprise! The estimation formula preserves associativity of joins!
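The estimate and its associativity can be checked numerically. The sketch below is a hypothetical helper, not part of the course material. It uses the statistics of the R, S, T example on slide 118. It also assumes that the join result keeps the smaller distinct-value count on a shared attribute, which is what containment of value sets implies.

```python
def estimate_join(r, s):
    """Estimate R Join S.

    r and s are (tuple_count, {attribute: distinct_value_count}) pairs;
    the result has the same shape so that estimates can be chained.
    """
    n_r, v_r = r
    n_s, v_s = s
    shared = set(v_r) & set(v_s)

    # #(R)#(S) divided by one max(...) factor per shared attribute.
    size = n_r * n_s
    for attr in shared:
        size = size / max(v_r[attr], v_s[attr])

    # Non-join attributes keep their value counts (preservation of value
    # sets); join attributes keep the smaller count (assumption, see lead-in).
    v = {**v_r, **v_s}
    for attr in shared:
        v[attr] = min(v_r[attr], v_s[attr])
    return size, v


R = (1000, {"a": 100, "b": 200})
S = (1000, {"b": 100, "c": 500})
T = (1000, {"c": 20, "d": 50})

rs = estimate_join(R, S)   # 5000 tuples
st = estimate_join(S, T)   # 2000 tuples
rt = estimate_join(R, T)   # no shared attribute: 1000000 tuples

# The three-way estimate does not depend on which join is done first.
left_first = estimate_join(rs, T)[0]
right_first = estimate_join(R, st)[0]
print(left_first == right_first)  # True
```

The intermediate sizes differ widely (5000, 2000 and 1000000), which is what makes the choice of join order matter even though the final estimate is the same.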
In other words, “it takes care of itself!”
Thus, for a join attribute appearing more than 2 times:
3 times: use the two highest values
4 times: use the three highest values
etc.
CS 5614: Misc. SQL Stuff, Safety in Queries 116
More on Join Associativity and Commutativity
Which is better: (R Join S) or (S Join R)?
Good to put the smaller relation on the left. Why? Most join algorithms are asymmetric.
Example: construct a “good query tree” for the following
SELECT movietitle FROM Actors,ActedIn WHERE Actors.name = ActedIn.actorname AND Actors.age = 23
Number of possible trees of n relations:
Arises from the shape of the trees: T(n)
Arises from permuting the leaves: n!
Total choices: n! T(n)
CS 5614: Misc. SQL Stuff, Safety in Queries 117
What is T(n)?
Sample values: 1: 1, 2: 1, 3: 2, 4: 5, 5: 14
A formula:
T(1) = 1 (basis)
T(n) = T(1)T(n-1) + T(2)T(n-2) + ..... + T(n-1)T(1)
Classifications:
Left-deep trees: all right children are leaves
Right-deep trees: all left children are leaves
Bushy trees: neither left-deep nor right-deep
Choosing a join order: restricted to left-deep trees
By dynamic programming: O(n!)
Greedy approach: make local selections
CS 5614: Misc. SQL Stuff, Safety in Queries 118
Example
Consider
R(a,b): #(R) = 1000, V(R,a) = 100, V(R,b) = 200
S(b,c): #(S) = 1000, V(S,b) = 100, V(S,c) = 500
T(c,d): #(T) = 1000, V(T,c) = 20, V(T,d) = 50
Possible join orders:
(R Join S) Join T
(S Join R) Join T (same as above; why?)
(R Join T) Join S
(T Join R) Join S (same as above)
(S Join T) Join R
(T Join S) Join R (same as above)
Cost estimation = sizes of intermediate relations
(R Join S) Join T: 5000
(R Join T) Join S: 1000000
(S Join T) Join R: 2000
Best plan = (S Join T) Join R
CS 5614: Misc. SQL Stuff, Safety in Queries 119
Database Tuning: Why?
Two families of queries: OLTP (access a small number of records); OLAP (summarize from a large number of records)
Sources of poor performance: imprecise data searches; random vs. sequential disk accesses; short bursts of database interaction; delays due to multiple transactions
What can be done? Tune hardware architecture; tune OS; tune data structures and indices
CS 5614: Misc. SQL Stuff, Safety in Queries 120
Examples of Database Tuning
To normalize or not to: sacrificing redundancy elimination; sacrificing dependency elimination; several choices of normalized schemas
Vertical Partitioning Applications
Recomputing indices: histograms etc. might be outdated
Restricting uses of subqueries: unnesting query blocks by joins
Declining the use of indices: table scans for small tables; rule-based optimization: rewrite A=6 as “A+0=6”
Provide redundant tables: decision-support/data mining queries
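The “A+0=6” rewrite can be observed in a query planner. This is a sketch assuming SQLite via Python's sqlite3 module; the table T, column B and index name idx_a are hypothetical, and the exact wording of plan descriptions differs between SQLite versions and between DBMSs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (A INTEGER, B TEXT)")
conn.execute("CREATE INDEX idx_a ON T(A)")


def plan(sql):
    """Return the planner's description lines for a query (SQLite-specific)."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# A plain equality test on A lets the planner use the index.
indexed = plan("SELECT * FROM T WHERE A = 6")
# Wrapping A in an expression hides it from the index, forcing a table scan.
scanned = plan("SELECT * FROM T WHERE A + 0 = 6")

print(indexed)
print(scanned)
```

In this setup the first plan mentions idx_a and the second does not, which is the behavior the rewrite relies on when a table scan is the cheaper choice.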