Introduction to Data Management
CSE 344

Lectures 4 and 5: Aggregates in SQL

Outline

• Nulls (6.1.6 - 6.1.7)
• Outer joins (6.3.8)
• Aggregations (6.4.3 – 6.4.6)
• Examples, examples, examples…

Magda Balazinska - CSE 344, Fall 2011

NULLS in SQL

• Whenever we don't have a value, we can put a NULL
• Can mean many things:
  – Value does not exist
  – Value exists but is unknown
  – Value not applicable
  – Etc.
• The schema specifies for each attribute whether it can be null (nullable attribute) or not
• How does SQL cope with tables that have NULLs?

Null Values

• If x = NULL then 4*(3-x)/7 is still NULL
• If x = NULL then x = 'Joe' is UNKNOWN
• In SQL there are three boolean values:
    FALSE = 0
    UNKNOWN = 0.5
    TRUE = 1
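The two NULL rules above can be checked directly in SQLite. This is a minimal sketch using Python's standard sqlite3 module; it assumes only an in-memory database and binds x to NULL through a query parameter. Python shows SQL NULL as None, and SQLite reports the UNKNOWN result of a comparison as NULL as well.

```python
import sqlite3

# In-memory SQLite database; no tables are needed for these expressions.
conn = sqlite3.connect(":memory:")

x = None  # Python's None is passed to SQLite as NULL

# Arithmetic on NULL yields NULL.
arith = conn.execute("SELECT 4 * (3 - ?) / 7", (x,)).fetchone()[0]

# Comparing NULL with a value yields UNKNOWN, which SQLite returns as NULL.
comparison = conn.execute("SELECT ? = 'Joe'", (x,)).fetchone()[0]

print(arith, comparison)
conn.close()
```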

Null Values

• C1 AND C2 = min(C1, C2)
• C1 OR C2 = max(C1, C2)
• NOT C1 = 1 – C1

    SELECT *
    FROM Person
    WHERE (age < 25) AND
          (height > 6 OR weight > 190)

E.g. age=20, height=NULL, weight=200

Rule in SQL: include only tuples that yield TRUE

Null Values

Unexpected behavior:

    SELECT *
    FROM Person
    WHERE age < 25 OR age >= 25

Some Person tuples are not included!
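The min/max rules can be traced by hand for the two WHERE clauses above. The sketch below is plain Python, not SQL; the helper names (and3, or3, not3, compare) are hypothetical and only mirror the numeric encoding FALSE = 0, UNKNOWN = 0.5, TRUE = 1.

```python
# Encode the three truth values as numbers.
FALSE, UNKNOWN, TRUE = 0.0, 0.5, 1.0

def and3(c1, c2):
    return min(c1, c2)

def or3(c1, c2):
    return max(c1, c2)

def not3(c1):
    return 1 - c1

def compare(value, test):
    """Hypothetical helper: any comparison involving NULL (None) is UNKNOWN."""
    if value is None:
        return UNKNOWN
    return TRUE if test(value) else FALSE

# Example tuple: age=20, height=NULL, weight=200.
age, height, weight = 20, None, 200
first = and3(
    compare(age, lambda a: a < 25),
    or3(compare(height, lambda h: h > 6), compare(weight, lambda w: w > 190)),
)

# A tuple whose age is NULL, tested with "age < 25 OR age >= 25".
null_age = None
second = or3(
    compare(null_age, lambda a: a < 25),
    compare(null_age, lambda a: a >= 25),
)

print(first == TRUE)   # True: the first tuple is included
print(second == TRUE)  # False: UNKNOWN is not TRUE, so the tuple is dropped
```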

Null Values

Can test for NULL explicitly:
  – x IS NULL
  – x IS NOT NULL

    SELECT *
    FROM Person
    WHERE age < 25 OR age >= 25 OR age IS NULL

Now it includes all Person tuples

Outerjoins

Product(name, category)
Purchase(prodName, store)

An "inner join":

    SELECT Product.name, Purchase.store
    FROM Product, Purchase
    WHERE Product.name = Purchase.prodName

Same as:

    SELECT Product.name, Purchase.store
    FROM Product JOIN Purchase ON
         Product.name = Purchase.prodName

But Products that never sold will be lost!
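Going back to the Person query with the explicit NULL test, the effect of adding age IS NULL can be seen on a tiny table. This is a sketch only: the Person rows and the name column are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical Person table; only the age column matters here.
conn.execute("CREATE TABLE Person (name TEXT, age INT)")
conn.executemany(
    "INSERT INTO Person VALUES (?, ?)",
    [("Ann", 20), ("Bob", 30), ("Cat", None)],
)

# Without the explicit test, the row with a NULL age yields UNKNOWN.
without_test = conn.execute(
    "SELECT name FROM Person WHERE age < 25 OR age >= 25 ORDER BY name"
).fetchall()

# With the explicit test, every row yields TRUE.
with_test = conn.execute(
    "SELECT name FROM Person "
    "WHERE age < 25 OR age >= 25 OR age IS NULL ORDER BY name"
).fetchall()

print(without_test)  # the row with a NULL age is missing
print(with_test)     # all three rows
conn.close()
```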

Outerjoins

Product(name, category)
Purchase(prodName, store)

If we want the never-sold products, need an "outerjoin":

    SELECT Product.name, Purchase.store
    FROM Product LEFT OUTER JOIN Purchase ON
         Product.name = Purchase.prodName

Product
Name      Category
Gizmo     gadget
Camera    Photo
OneClick  Photo

Purchase
ProdName  Store
Gizmo     Wiz
Camera    Ritz
Camera    Wiz

Result
Name      Store
Gizmo     Wiz
Camera    Ritz
Camera    Wiz
OneClick  NULL
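The inner join and the left outer join can be compared on the Product and Purchase rows shown above. A minimal sketch with Python's sqlite3 module; the column types and the ORDER BY clauses are assumptions added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product (name TEXT, category TEXT)")
conn.execute("CREATE TABLE Purchase (prodName TEXT, store TEXT)")
conn.executemany(
    "INSERT INTO Product VALUES (?, ?)",
    [("Gizmo", "gadget"), ("Camera", "Photo"), ("OneClick", "Photo")],
)
conn.executemany(
    "INSERT INTO Purchase VALUES (?, ?)",
    [("Gizmo", "Wiz"), ("Camera", "Ritz"), ("Camera", "Wiz")],
)

# Inner join: OneClick was never sold, so it does not appear.
inner = conn.execute("""
    SELECT Product.name, Purchase.store
    FROM Product JOIN Purchase ON Product.name = Purchase.prodName
    ORDER BY Product.name, Purchase.store
""").fetchall()

# Left outer join: OneClick is kept, with NULL (None) for the store.
outer = conn.execute("""
    SELECT Product.name, Purchase.store
    FROM Product LEFT OUTER JOIN Purchase ON Product.name = Purchase.prodName
    ORDER BY Product.name, Purchase.store
""").fetchall()

print(inner)
print(outer)
conn.close()
```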

Outer Joins

• Left outer join:
  – Include the left tuple even if there's no match
• Right outer join:
  – Include the right tuple even if there's no match
• Full outer join:
  – Include both left and right tuples even if there's no match

Aggregation in SQL

    sqlite3 lecture04

Specify a filename where the database will be stored

    create table Purchase
      (pid int primary key,
       product varchar(15),
       price float,
       quantity int,
       month varchar(15));

    .import data.txt Purchase

Other DBMSs have other ways of importing data

Simple Aggregations

Five basic aggregate operations in SQL

• select count(*) from Purchase
• select count(quantity) from Purchase
• select sum(quantity) from Purchase
• select avg(price) from Purchase
• select max(quantity) from Purchase
• select min(quantity) from Purchase

Except count, all aggregations apply to a single attribute

Aggregates and NULL Values

Null values are not used in aggregates

    insert into Purchase
    values(11, 'gadget', NULL, NULL, 'april')

Let's try the following:

• select count(*) from Purchase
• select count(quantity) from Purchase
• select sum(quantity) from Purchase
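The three queries can be tried on a small Purchase table. In this sketch the product, price and quantity values are the Bagel and Banana rows used later in these slides, while the pid and month values of those five rows are made up; the sixth row is the insert shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Purchase
      (pid int primary key, product varchar(15), price float,
       quantity int, month varchar(15))
""")
conn.executemany(
    "INSERT INTO Purchase VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Bagel", 3, 20, "jan"),     # pid and month are hypothetical
        (2, "Bagel", 1.50, 20, "feb"),
        (3, "Banana", 0.5, 50, "jan"),
        (4, "Banana", 2, 10, "feb"),
        (5, "Banana", 4, 10, "mar"),
        (11, "gadget", None, None, "april"),  # the row with NULLs
    ],
)

count_all = conn.execute("select count(*) from Purchase").fetchone()[0]
count_qty = conn.execute("select count(quantity) from Purchase").fetchone()[0]
sum_qty = conn.execute("select sum(quantity) from Purchase").fetchone()[0]
avg_price = conn.execute("select avg(price) from Purchase").fetchone()[0]

# count(*) counts rows; the other aggregates skip the NULL values.
print(count_all, count_qty, sum_qty, avg_price)
conn.close()
```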

Counting Duplicates

COUNT applies to duplicates, unless otherwise stated:

    SELECT Count(product)
    FROM Purchase
    WHERE price > 4.99

same as Count(*) (as long as product is never NULL)

We probably want:

    SELECT Count(DISTINCT product)
    FROM Purchase
    WHERE price > 4.99

More Examples

    SELECT Sum(price * quantity)
    FROM Purchase

    SELECT Sum(price * quantity)
    FROM Purchase
    WHERE product = 'bagel'

What do they mean?
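The difference between Count(product) and Count(DISTINCT product) shows up as soon as one product has several qualifying purchases. A sketch with made-up rows (the products and prices below are not from the slides), reduced to the two columns the queries use:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Purchase (product varchar(15), price float)")
conn.executemany(
    "INSERT INTO Purchase VALUES (?, ?)",
    [  # hypothetical rows
        ("gizmo", 9.99),
        ("gizmo", 19.99),
        ("gadget", 29.99),
        ("bagel", 1.50),
    ],
)

# Counts every qualifying row, duplicates included.
with_duplicates = conn.execute(
    "SELECT Count(product) FROM Purchase WHERE price > 4.99"
).fetchone()[0]

# Counts each qualifying product once.
distinct_products = conn.execute(
    "SELECT Count(DISTINCT product) FROM Purchase WHERE price > 4.99"
).fetchone()[0]

print(with_duplicates, distinct_products)
conn.close()
```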

Simple Aggregations

Purchase
Product  Price  Quantity
Bagel    3      20
Bagel    1.50   20
Banana   0.5    50
Banana   2      10
Banana   4      10

    SELECT Sum(price * quantity)
    FROM Purchase
    WHERE product = 'Bagel'

90 (= 60+30)

Grouping and Aggregation

Purchase(product, price, quantity)

Find total quantities for all sales over $1, by product.

    SELECT product, Sum(quantity) AS TotalSales
    FROM Purchase
    WHERE price > 1
    GROUP BY product

Let's see what this means…

Grouping and Aggregation

1. Compute the FROM and WHERE clauses.
2. Group by the attributes in the GROUPBY
3. Compute the SELECT clause: grouped attributes and aggregates.

1&2. FROM-WHERE-GROUPBY

Product  Price  Quantity
Bagel    3      20
Bagel    1.50   20
Banana   0.5    50
Banana   2      10
Banana   4      10
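The three steps can be traced by hand on the five Purchase rows, for the query that sums quantity per product where price > 1. The sketch below is plain Python rather than SQL, and is only meant to mirror the order of evaluation; the variable names are illustrative.

```python
from collections import defaultdict

# Purchase(product, price, quantity) rows from the table above.
purchase = [
    ("Bagel", 3.0, 20),
    ("Bagel", 1.50, 20),
    ("Banana", 0.5, 50),
    ("Banana", 2.0, 10),
    ("Banana", 4.0, 10),
]

# Step 1: FROM and WHERE: keep the rows with price > 1.
filtered = [row for row in purchase if row[1] > 1]

# Step 2: GROUP BY product.
groups = defaultdict(list)
for product, price, quantity in filtered:
    groups[product].append(quantity)

# Step 3: SELECT product, Sum(quantity) AS TotalSales.
total_sales = {product: sum(quantities) for product, quantities in groups.items()}

print(total_sales)  # {'Bagel': 40, 'Banana': 20}
```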

3. SELECT

Product  Price  Quantity
Bagel    3      20
Bagel    1.50   20
Banana   0.5    50
Banana   2      10
Banana   4      10

Product  TotalSales
Bagel    40
Banana   20

    SELECT product, Sum(quantity) AS TotalSales
    FROM Purchase
    WHERE price > 1
    GROUP BY product

Other Examples

Compare these two queries:

    SELECT product, count(*)
    FROM Purchase
    GROUP BY product

    SELECT month, count(*)
    FROM Purchase
    GROUP BY month

What does it mean?

    SELECT product,
           sum(quantity) AS SumQuantity,
           max(price) AS MaxPrice
    FROM Purchase
    GROUP BY product

Need to be Careful…

    SELECT product, max(quantity)
    FROM Purchase
    GROUP BY product

    SELECT product, quantity
    FROM Purchase
    GROUP BY product

Product  Price  Quantity
Bagel    3      20
Bagel    1.50   20
Banana   0.5    50
Banana   2      10
Banana   4      10

SQLite is WRONG on the second query: it accepts the query and returns a quantity from an arbitrary row of each group. SQL Server correctly gives an error.

Ordering Results

    SELECT product, sum(price*quantity) as rev
    FROM purchase
    GROUP BY product
    ORDER BY rev desc
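The ordering query can be run on the same five Purchase rows to see groups sorted by revenue. A minimal sqlite3 sketch; the table is reduced to the three columns shown in these slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Purchase (product varchar(15), price float, quantity int)")
conn.executemany(
    "INSERT INTO Purchase VALUES (?, ?, ?)",
    [
        ("Bagel", 3, 20),
        ("Bagel", 1.50, 20),
        ("Banana", 0.5, 50),
        ("Banana", 2, 10),
        ("Banana", 4, 10),
    ],
)

# One row per product, highest revenue first.
by_revenue = conn.execute("""
    SELECT product, sum(price*quantity) AS rev
    FROM Purchase
    GROUP BY product
    ORDER BY rev DESC
""").fetchall()

print(by_revenue)  # Bagel: 60 + 30 = 90, Banana: 25 + 20 + 40 = 85
conn.close()
```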

HAVING Clause

Same query as earlier, except that we consider only products that had more than 30 sales.

    SELECT product, Sum(quantity)
    FROM Purchase
    WHERE price > 1
    GROUP BY product
    HAVING Sum(quantity) > 30

HAVING clause contains conditions on aggregates.

WHERE vs HAVING

• WHERE condition is applied to individual rows
  – The rows may or may not contribute to the aggregate
  – No aggregates allowed here
• HAVING condition is applied to the entire group
  – Entire group is returned, or not at all
  – May use aggregate functions in the group
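The interplay of WHERE and HAVING can be seen on the five Purchase rows from the earlier slides. In this sketch, WHERE first drops the Banana row priced at 0.5, and HAVING then drops the whole Banana group because its remaining quantities sum to only 20. The ORDER BY is an addition for deterministic output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Purchase (product varchar(15), price float, quantity int)")
conn.executemany(
    "INSERT INTO Purchase VALUES (?, ?, ?)",
    [
        ("Bagel", 3, 20),
        ("Bagel", 1.50, 20),
        ("Banana", 0.5, 50),
        ("Banana", 2, 10),
        ("Banana", 4, 10),
    ],
)

# WHERE only: filters individual rows, both groups survive.
where_only = conn.execute("""
    SELECT product, Sum(quantity)
    FROM Purchase
    WHERE price > 1
    GROUP BY product
    ORDER BY product
""").fetchall()

# WHERE plus HAVING: the Banana group is removed as a whole.
with_having = conn.execute("""
    SELECT product, Sum(quantity)
    FROM Purchase
    WHERE price > 1
    GROUP BY product
    HAVING Sum(quantity) > 30
    ORDER BY product
""").fetchall()

print(where_only)
print(with_having)
conn.close()
```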

Aggregates and Joins

    create table Product
      (pid int primary key,
       pname varchar(15),
       manufacturer varchar(15));

    insert into product values(1, 'bagel', 'Sunshine Co.');
    insert into product values(2, 'banana', 'BusyHands');
    insert into product values(3, 'gizmo', 'GizmoWorks');
    insert into product values(4, 'gadget', 'BusyHands');
    insert into product values(5, 'powerGizmo', 'PowerWorks');

Aggregate + Join Example

What do these queries mean?

    SELECT x.manufacturer, count(*)
    FROM Product x, Purchase y
    WHERE x.pname = y.product
    GROUP BY x.manufacturer

    SELECT x.manufacturer, y.month, count(*)
    FROM Product x, Purchase y
    WHERE x.pname = y.product
    GROUP BY x.manufacturer, y.month

General form of Grouping and Aggregation

    SELECT S
    FROM R1,…,Rn
    WHERE C1
    GROUP BY a1,…,ak
    HAVING C2

S = may contain attributes a1,…,ak and/or any aggregates but NO OTHER ATTRIBUTES

Why?

C1 = is any condition on the attributes in R1,…,Rn
C2 = is any condition on aggregate expressions and on attributes a1,…,ak

Semantics

    SELECT S
    FROM R1,…,Rn
    WHERE C1
    GROUP BY a1,…,ak
    HAVING C2

Evaluation steps:

1. Evaluate FROM-WHERE, apply condition C1
2. Group by the attributes a1,…,ak
3. Apply condition C2 to each group (may have aggregates)
4. Compute aggregates in S and return the result

Empty Groups

• In the result of a query with GROUP BY, there is one row per group
• No group can be empty!
• In particular, count(*) is never 0

    SELECT x.manufacturer, count(*)
    FROM Product x, Purchase y
    WHERE x.pname = y.product
    GROUP BY x.manufacturer

What if there are no purchases for a manufacturer?

Empty Groups: Example

    SELECT product, count(*)
    FROM purchase
    GROUP BY product

4 groups in our example dataset

    SELECT product, count(*)
    FROM purchase
    WHERE price > 2.0
    GROUP BY product

3 groups in our example dataset

Empty Group Problem

What if there are no purchases for a manufacturer?

    SELECT x.manufacturer, count(*)
    FROM Product x, Purchase y
    WHERE x.pname = y.product
    GROUP BY x.manufacturer

Empty Group Solution: Outer Join

    SELECT x.manufacturer, count(y.pid)
    FROM Product x LEFT OUTER JOIN Purchase y
    ON x.pname = y.product
    GROUP BY x.manufacturer
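The two queries can be compared side by side. This sketch uses the five Product rows from the Aggregates and Joins slide; the Purchase rows are made up and reduced to the pid and product columns, chosen so that GizmoWorks and PowerWorks have no purchases. The ORDER BY is added for deterministic output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product (pid int primary key, pname varchar(15), "
    "manufacturer varchar(15))"
)
conn.execute("CREATE TABLE Purchase (pid int primary key, product varchar(15))")
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?)",
    [
        (1, "bagel", "Sunshine Co."),
        (2, "banana", "BusyHands"),
        (3, "gizmo", "GizmoWorks"),
        (4, "gadget", "BusyHands"),
        (5, "powerGizmo", "PowerWorks"),
    ],
)
conn.executemany(
    "INSERT INTO Purchase VALUES (?, ?)",
    [(1, "bagel"), (2, "bagel"), (3, "banana"), (4, "gadget")],  # hypothetical rows
)

# Inner join: manufacturers without purchases form no group at all.
inner_counts = conn.execute("""
    SELECT x.manufacturer, count(*)
    FROM Product x, Purchase y
    WHERE x.pname = y.product
    GROUP BY x.manufacturer
    ORDER BY x.manufacturer
""").fetchall()

# Outer join: count(y.pid) skips the NULL pid of unmatched products, giving 0.
outer_counts = conn.execute("""
    SELECT x.manufacturer, count(y.pid)
    FROM Product x LEFT OUTER JOIN Purchase y
    ON x.pname = y.product
    GROUP BY x.manufacturer
    ORDER BY x.manufacturer
""").fetchall()

print(inner_counts)
print(outer_counts)
conn.close()
```

Counting y.pid rather than using count(*) matters here: count(*) would count the NULL-padded row and report 1 for a manufacturer with no purchases.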
