From Roger's Access Blog
Select Queries
By Roger J. Carlson
Contents
Introduction
SELECT Clause
FROM Clause
WHERE Clause
Examples of simple Where clauses
Multiple Criteria and Logical Operators
NOT Operator
ORDER BY Clause
GROUP BY and HAVING Clauses
PREDICATE (optional)
Top Predicate
DISTINCT Predicate
DISTINCTROW
TRANSFORM...PIVOT (Crosstab Query)
PARAMETERS Clause (Optional)

Introduction
The most common type of query is the Select Query. Its purpose is to return a dataset that is a dynamic view into your database. In fact, in other SQL dialects, like T-SQL (SQL Server) or Oracle SQL, a saved Select Query is called a View.
The Select Query is very flexible and allows you to see your data in a variety of ways.
You can specify which columns you want to see.
You can also restrict the rows returned based on one or more criteria.
You can join multiple tables together on a common field(s).
You can return unique records or the top or bottom number (or percentage) of records, and sort your data on one or more columns.
And you can make your query interactive by adding parameters.
A Select Query can be created in the Query Builder View or directly in the SQL View. Before Access 2007, you would create a new query by going to the Database Window, selecting the Queries tab, and clicking New. Since Access 2007, go to the Create Tab on the Ribbon and select Query Design. In Access 2007, it looks like Figure 1. The top pane shows the table or tables on which the query is based. The bottom pane shows which fields will be used in the query and the criteria used to restrict the rows.
[insert SelectFigure1.jpg]
Switching to SQL View shows the above query like this:
SELECT Products.ProductName, Products.Cost, Products.Price FROM Products WHERE (((Products.Cost)>=5)) ORDER BY Products.ProductName;
It is fairly easy to learn SQL by creating a query in the Query Builder and switching to SQL View. So I'm going to talk about what the various sections of a Select Query are.
In general, a Select Query will look like this:
SELECT [optional Predicate] [Field List]
FROM [Table/Query/Join]
WHERE [criteria to restrict rows]
ORDER BY [Field List] ASC/DESC

SELECT Clause
The Select Clause identifies which fields will be displayed in the result set of the query. The Query Builder always qualifies the field name with the table name, which is not strictly necessary unless the field name exists in multiple tables. This will be important when we talk about Joins, but for now, the Select Clause could also look like this:
SELECT ProductName, Cost, Price
You can also create calculated columns in the Select Clause. For instance, suppose I wanted to know the difference between the cost and the price of a product. I could add a calculation to a new column. In the Query Builder, it would look like this:
[insert SelectFigure2.jpg]
Which would look like this in SQL:
SELECT ProductName, Cost, Price, [Price]-[Cost] AS Margin
You should give the calculated column a name, also called an 'alias' (otherwise Access assigns a generic one such as Expr1). I decided to call mine "Margin". In the Query Builder, the alias goes in front of the calculation. In SQL View, it follows the calculation with the AS keyword. The square brackets ([]) are not strictly necessary here, but there are times when they are. For instance, if you had a space in your field name (Product Name), you would be required to put square brackets around it ( [Product Name] ).
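To try the alias outside Access, here is a minimal sketch in Python using the built-in sqlite3 module as a stand-in for an Access database. That substitution is my assumption, not something Access itself uses, and the sample rows and prices are made up for illustration. SQLite's dialect is close to Access SQL for a query this simple, but it is not identical.

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, Cost REAL, Price REAL)")
# Hypothetical sample rows; the prices are invented for this sketch.
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?)",
    [("Ax", 3.0, 5.0), ("Hammer", 10.0, 14.0), ("Wrench", 4.0, 9.0)],
)

# The calculated column is named Margin with the AS keyword.
cursor = conn.execute(
    "SELECT ProductName, Cost, Price, Price - Cost AS Margin "
    "FROM Products ORDER BY ProductName"
)
column_names = [col[0] for col in cursor.description]
rows = cursor.fetchall()

print(column_names)
for row in rows:
    print(row)
```

The alias shows up as the column name in the result set, exactly as if Margin were a real field in the table.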
A word about naming: While Access SQL will allow you to use spaces, odd characters, and reserved words as field names by surrounding them in brackets, there are times when this will get you into trouble. Therefore it is best practice to avoid field and table names which will require brackets. You won't go wrong if you name your objects with letters, numbers, and the underscore only and avoid words like Date, Month, Table and other reserved words. You can find a list of reserved words here: http://support.microsoft.com/kb/286335.
You can also select ALL columns with the asterisk:
SELECT *
The asterisk is not the most efficient way of specifying columns, especially in a large table with many columns. If performance is an issue, you might want to restrict your query to just the columns you need. On the other hand, if your base table structure changes frequently, the asterisk can be useful because you don't have to change your query when you add or remove a column.

FROM Clause
The From Clause specifies the table or tables that form the basis of the query. Ultimately, of course, all the data comes from tables, but you are not limited to specifying table names. You can also specify saved queries just as if they were tables. This is useful for creating "stacked" queries, that is, queries that build upon a previous query or queries.
In the Query Builder, the From clause is represented by "tables" in the upper pane (see Figure 1). To add another table or query, right-click in the upper pane and select Add Table in the pop-up box.
In SQL View, the from clause looks like this:
FROM Products (Products being a table in our database)
Or
FROM qryProductsSold (assuming qryProductsSold is a saved query)
Adding the From clause to the query above yields:
SELECT ProductName, Cost, Price, [Price]-[Cost] AS Margin FROM Products
Another very important function of the From clause is to create Joins, that is, to merge the records of two or more tables on common fields. For instance, if I wanted to see all of my Orders and their corresponding Order Details, I could join the Orders table and the OrderDetails table. Figure 3 shows the Query Builder View.
[Insert SelectFigure3.jpg]
The SQL View looks like this:
FROM Orders INNER JOIN OrderDetails ON Orders.OrderID = OrderDetails.OrderID;
In this case, the fully qualified field name (that is with the table name in front of the field name) is required because the OrderID field exists in both tables.
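As a rough illustration of what the join returns, here is a small sketch in Python with the built-in sqlite3 module standing in for Access (my assumption). The columns other than OrderID and all of the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerNum TEXT)")
conn.execute(
    "CREATE TABLE OrderDetails (OrderID INTEGER, ProductName TEXT, Quantity INTEGER)"
)
# Hypothetical rows: order 3 has no detail rows at all.
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)", [(1, "101"), (2, "102"), (3, "201")]
)
conn.executemany(
    "INSERT INTO OrderDetails VALUES (?, ?, ?)",
    [(1, "Hammer", 2), (1, "Wrench", 1), (2, "Ax", 4)],
)

# OrderID exists in both tables, so it must be qualified with the table name.
joined_rows = conn.execute(
    """
    SELECT Orders.OrderID, Orders.CustomerNum,
           OrderDetails.ProductName, OrderDetails.Quantity
    FROM Orders INNER JOIN OrderDetails
         ON Orders.OrderID = OrderDetails.OrderID
    ORDER BY Orders.OrderID, OrderDetails.ProductName
    """
).fetchall()

for row in joined_rows:
    print(row)
```

Order 3 does not appear in the output because an INNER JOIN only returns rows that have a match in both tables.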
There's a lot more to be said about Joins, and I'll do that in a later post.
Lastly, it is also possible to use an entire SQL statement in a From Clause by simply putting it in parentheses followed by an alias. In the SQL View, this would be perfectly legal:
SELECT FullName, Address, City, State, Zip FROM (SELECT [FirstName] & " " & [Lastname] AS FullName, Address, City, State, Zip FROM Customer) AS FullCustomer;
This example is not particularly useful, but it does demonstrate how a SQL Statement can be used as a result set in the From clause. You have to be careful, however, because Access doesn't always handle this well. I'll also talk about that in a later post.

WHERE Clause
The Where Clause is one of the most powerful features of a query. It allows you to restrict the rows returned based on one or more criteria. These criteria are in the form of an expression, the general form of which is:
[Field] [comparison operator] [value]
The field is a column in the base table(s) or a calculated column. It does not have to be in the Field List. The comparison operators are the standard math comparators: =, <, >, <=, >=, or <>; with some SQL specific ones added: IN/EXISTS, BETWEEN, LIKE, and IS NULL. The value portion can be either a hard coded value (like "hammer" or 25), or it can be another field or expression.
The Where clause follows the From Clause in a SQL Statement like so:
SELECT ProductName, Cost, Price, [Price]-[Cost] AS Margin FROM Products WHERE ProductName = "hammer"
Examples of simple Where clauses:
WHERE ProductName = "hammer" (to show all hammers)
WHERE ProductName <> "hammer" (to show all products EXCEPT hammers)
WHERE Cost <= 0.05
WHERE Price < Cost (products sold below cost)
WHERE [Price] - [Cost] > 2 * [Cost] (where the margin is greater than twice the cost)
LIKE is used with character data only and uses the asterisk as a wildcard symbol:
WHERE ProductName LIKE "ham*" (returns "hammer" and "hammock")
WHERE ProductName LIKE "*nail" (returns "10p nail" and "8p nail")
IN allows you to test if a field matches one of a list of values.
WHERE Cost IN (1, 2, 5, 8)
IN can also contain another SQL statement. The SQL statement contained in the IN must have only one field in the Field List.
WHERE ProductName IN (SELECT ProductName FROM Products)
I'll talk more about this in a later post when I discuss subqueries.
BETWEEN allows you to test for a range of values:
WHERE Cost BETWEEN 5 AND 10
WHERE BeginDate BETWEEN #1/1/2008# AND #12/31/2008#
IS NULL is a special comparator that tests whether or not a field is Null. The other comparison operators do not work with Null, so IS NULL is the only way to test for it.
WHERE ProductName IS NULL
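Here is a small sketch of IN, BETWEEN, and IS NULL in action, written in Python with the built-in sqlite3 module as a stand-in for Access (an assumption on my part). The row with a missing ProductName is hypothetical. I left LIKE out because SQLite uses % as its wildcard rather than the asterisk that Access uses.

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, Cost INTEGER)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("Ax", 3), ("Hammer", 15), ("Hammer", 10),
     ("Wrench", 5), ("Wrench", 4), (None, 8)],  # None is stored as Null
)


def names_and_costs(where_clause):
    """Run a fixed SELECT with the given WHERE clause, sorted by Cost."""
    sql = f"SELECT ProductName, Cost FROM Products WHERE {where_clause} ORDER BY Cost"
    return conn.execute(sql).fetchall()


in_list = names_and_costs("Cost IN (1, 2, 5, 8)")
in_range = names_and_costs("Cost BETWEEN 5 AND 10")  # both ends are included
is_null = names_and_costs("ProductName IS NULL")
equals_null = names_and_costs("ProductName = NULL")  # never matches anything

print(in_list)
print(in_range)
print(is_null)
print(equals_null)
```

The last query returns no rows at all, which is why the equals sign cannot be used to test for Null.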
Multiple Criteria and Logical Operators
You can also have multiple criteria by using multiple expressions joined by the Logical Operators: AND/OR. Examples of simple multiple criteria:
WHERE ProductName = "10p nail" OR ProductName = "8p nail"
WHERE ProductName = "saw" AND Cost BETWEEN 5 AND 10
WHERE ProductName = "saw" OR Cost BETWEEN 5 AND 10
Unfortunately, the Query Builder View of Where clauses (or Criteria as it's called in the QB) looks quite different than in the SQL View. In the QB, you do not need to repeat the field name with an OR or AND statement like you do in SQL. For instance, if I wanted to display both 10p nails and 8p nails, my Where clause would look like this:
WHERE ProductName = "10p nail" OR ProductName = "8p nail"
But the Query Builder would look like this:
Of course, this would also work:
You have to be careful when creating multiple criteria in the Query Builder because sometimes they don't say exactly what you think they do. It matters on which lines of the Criteria grid you put your expressions. You create ORs on separate lines, while you create ANDs on the same line.
So this:
WHERE ProductName = "saw" OR Cost BETWEEN 5 AND 10
Translates to:
(Notice the criteria are on separate lines.)
But this:
WHERE ProductName = "saw" AND Cost BETWEEN 5 AND 10
Translates to:
(Notice here the criteria are on the same line.)
It gets even more complicated with multiple ANDs and ORs. For instance, what does this statement mean?
WHERE ProductName = "saw" OR ProductName = "hammer" AND Cost BETWEEN 5 AND 10
You might think it means I want "saws and hammers with a cost between 5 and 10". However, it doesn't. There is an Order of Precedence to the logical operators, just as there is with arithmetic operators. AND always takes precedence over OR. So what will really be returned is "ALL saws and only those hammers that cost between 5 and 10". The AND expression will be evaluated before the OR expression.
So how do I get what I want? The answer is parentheses. If I want something evaluated out of precedence (like my OR expression) I surround it with parentheses. So it would be this:
WHERE (ProductName = "saw" OR ProductName = "hammer") AND Cost BETWEEN 5 AND 10
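The difference between the two Where clauses is easy to check with a few rows. The sketch below uses Python's built-in sqlite3 module as a stand-in for Access (my assumption), with hypothetical sample data. Note that SQLite wants single quotes around text values where Access accepts double quotes.

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, Cost INTEGER)")
# Hypothetical rows chosen so the two Where clauses give different answers.
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("saw", 3), ("saw", 7), ("hammer", 6), ("hammer", 12), ("wrench", 8)],
)

# No parentheses: AND binds first, so every saw comes back.
without_parens = conn.execute(
    """
    SELECT ProductName, Cost FROM Products
    WHERE ProductName = 'saw' OR ProductName = 'hammer' AND Cost BETWEEN 5 AND 10
    ORDER BY ProductName, Cost
    """
).fetchall()

# Parentheses force the OR to be evaluated first.
with_parens = conn.execute(
    """
    SELECT ProductName, Cost FROM Products
    WHERE (ProductName = 'saw' OR ProductName = 'hammer') AND Cost BETWEEN 5 AND 10
    ORDER BY ProductName, Cost
    """
).fetchall()

print(without_parens)
print(with_parens)
```

The saw that costs 3 appears only in the first result, because without parentheses the cost range applies to hammers alone.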
In the Query Builder, it would look like this:
One last thing about the Query Builder and Where clauses. The QB overdoes it with table names and parentheses. If I create the following query in the Query Builder:
I will get the following SQL statement:
SELECT Products.ProductName, Products.Price FROM Products WHERE (((Products.ProductName)="saw" Or (Products.ProductName)="hammer") AND ((Products.Cost) In (1,2,5,7)));
Because there is only one table in the query, the table name preceding every field is not necessary. The Query Builder also adds extra parentheses, which keep the query technically correct but make it harder to read. Many of these parentheses can be removed, but not all. Naturally, the parentheses that surround the IN list must remain, as must those around the OR expression that we want to take precedence. The query can be cleaned to look like this:
SELECT ProductName, Price FROM Products WHERE (ProductName = "saw" Or ProductName = "hammer") AND Cost In (1,2,5,7);
Because of the possibility of error when creating complex Where clauses in the Query Builder, I usually create them in the SQL View where I know I can control the parentheses.
NOT Operator
There is one more important Logical Operator, and that's the NOT operator. The NOT operator reverses an expression. It can be used with the LIKE, IN, BETWEEN, and IS NULL operators or with entire multiple-expression Where clauses. It returns all rows EXCEPT those that would have been returned if you hadn't used the NOT.
WHERE ProductName NOT LIKE "ham*"
WHERE Cost NOT IN (1, 2, 5, 8)
WHERE Cost NOT BETWEEN 5 AND 10
WHERE Cost IS NOT NULL
It's important to remember that NOT also reverses both the comparison operators and the other logical operators, so you need to be really careful. For instance,
WHERE NOT(ProductName="saw" Or ProductName="hammer")
Is equivalent to:
WHERE (ProductName<>"saw" AND ProductName<>"hammer")

ORDER BY Clause
Tables in relational databases like Access (or SQL Server or Oracle, for that matter) do not have any intrinsic order. This means the rows can be returned in any order, not necessarily the order in which they were entered or in which they appear when you open the table. So if you want them in a particular order, you have to sort them yourself. You do that in a query in the ORDER BY clause.
The ORDER BY clause has two parts: [field] [ASC/DESC] (repeat for as many fields as you need).
The field designates the field on which the sort will be performed and ASC/DESC tells how the field will be sorted. ASC means ascending (smallest to largest). DESC means descending (largest to smallest). ASC is the default, so if you don't designate an order, it will be smallest to largest.
Letters are sorted alphabetically A-Z (or Z-A if DESC). Numbers are sorted numerically. This is all very obvious until you have numbers that are stored as text. While they look like numbers, they will not sort numerically. For instance, here are some character strings sorted alphabetically:
CustomerNum
-----------
101
102
1100
201
3001
301
One solution is to add leading zeros to your text numbers. Numeric data will not save leading zeros, but text will:
CustomerNum
-----------
0101
0102
0201
0301
1100
3001
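Both orderings can be reproduced with a short sketch. It uses Python's built-in sqlite3 module as a stand-in for Access (an assumption), and the PaddedNum column is a hypothetical helper that I added just to hold the zero-padded copies.

```python
import sqlite3

# Customer numbers stored as text, entered in no particular order.
customer_nums = ["301", "101", "3001", "102", "201", "1100"]

# In-memory SQLite database standing in for the Access file (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustomerNum TEXT, PaddedNum TEXT)")
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?)",
    [(num, num.zfill(4)) for num in customer_nums],  # zfill adds leading zeros
)

# Text is compared character by character, so "1100" lands before "201".
text_sorted = [
    row[0]
    for row in conn.execute("SELECT CustomerNum FROM Customer ORDER BY CustomerNum")
]
# With leading zeros, the text order matches the numeric order.
padded_sorted = [
    row[0]
    for row in conn.execute("SELECT PaddedNum FROM Customer ORDER BY PaddedNum")
]

print(text_sorted)
print(padded_sorted)
```

Padding only works if every value is padded to the same width, so the width has to be at least as long as the longest number.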
The Order By clause follows the Where clause:
SELECT ProductName, Cost, Price, [Price]-[Cost] AS Margin FROM Products WHERE ProductName = "hammer" ORDER BY Cost DESC
Examples of ORDER BY clauses:
ORDER BY ProductName (Products alphabetically A-Z)
ORDER BY [Price]-[Cost] DESC (margin highest to lowest)
ORDER BY ProductName, Cost DESC
The last example shows that you can also sort on multiple fields. In this case, the table will be sorted on ProductName (ascending) and within each group of products, it will be sorted on Cost (descending).
ProductName  Cost
-----------  ----
Ax           $3
Hammer       $15
Hammer       $10
Wrench       $5
Wrench       $4
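The two-level sort can be checked with a quick sketch in Python, using the built-in sqlite3 module as a stand-in for Access (my assumption). The rows are inserted out of order on purpose so that the ORDER BY clause has something to do.

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, Cost INTEGER)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("Ax", 3), ("Wrench", 4), ("Hammer", 15), ("Wrench", 5), ("Hammer", 10)],
)

# ProductName ascending (the default), then Cost descending within each name.
sorted_rows = conn.execute(
    "SELECT ProductName, Cost FROM Products ORDER BY ProductName, Cost DESC"
).fetchall()

for row in sorted_rows:
    print(row)
```

DESC applies only to the field it follows, so ProductName is still sorted A-Z here.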
The fields in the ORDER BY clause do not have to be in the Field List of the SELECT clause. For instance, this will work just fine:
SELECT ProductName FROM Products ORDER BY Cost
In the Query Builder, simply uncheck the Show checkbox for that field to do this:
You can also sort on an expression. I showed [Price]-[Cost], but you can also apply functions to your fields. For instance, earlier I showed how text numbers will sort incorrectly. I showed how leading zeros will correct it, but another solution is to display the text numbers as they are in the field list and add a function to the ORDER BY clause that converts them to numeric.
SELECT [CustomerNum] FROM Customer ORDER BY CLng([CustomerNum]);
CLng() is a built-in function that converts Text to Long Integer.

GROUP BY and HAVING Clauses
The GROUP BY and HAVING clauses are used with "Totals" or Aggregate queries.
The GROUP BY clause allows you to group your data and apply a function to the grouped data. For instance,
SELECT ProductName, Cost FROM Products
Will result in the following:
ProductName  Cost
-----------  ----
Ax           $3
Wrench       $4
Hammer       $15
Wrench       $5
Hammer       $10
However, if I wanted to see the average cost for each group, I could add a GROUP BY clause:
SELECT ProductName, Avg(Cost) AS AvgCost FROM Products GROUP BY ProductName
ProductName  AvgCost
-----------  -------
Ax           $3
Wrench       $4.5
Hammer       $12.5
Every field in the Field List MUST either appear in the GROUP BY clause or be wrapped in an aggregate function. Examples of aggregate functions are: Sum, Avg, Min, Max, and Count.
The HAVING clause works like the WHERE clause, but it restricts rows after they have been grouped.
SELECT ProductName, Avg(Cost) AS AvgCost FROM Products GROUP BY ProductName HAVING Avg(Cost)>4
Will return:
ProductName  AvgCost
-----------  -------
Wrench       $4.5
Hammer       $12.5
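Here is a minimal sketch of the grouping and the HAVING filter, using Python's built-in sqlite3 module as a stand-in for Access (an assumption on my part). The AvgCost alias is my own name for the aggregated column. I added an ORDER BY so the output order is predictable.

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, Cost INTEGER)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("Ax", 3), ("Wrench", 4), ("Hammer", 15), ("Wrench", 5), ("Hammer", 10)],
)

# The aggregate goes in the Field List; only ProductName is grouped on.
averages = conn.execute(
    """
    SELECT ProductName, AVG(Cost) AS AvgCost
    FROM Products
    GROUP BY ProductName
    ORDER BY ProductName
    """
).fetchall()

# HAVING filters the groups after the averages have been computed.
averages_over_4 = conn.execute(
    """
    SELECT ProductName, AVG(Cost) AS AvgCost
    FROM Products
    GROUP BY ProductName
    HAVING AVG(Cost) > 4
    ORDER BY ProductName
    """
).fetchall()

print(averages)
print(averages_over_4)
```

The Ax group drops out of the second result because its average cost of 3 does not pass the HAVING test.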
I will discuss Totals Queries in greater depth in a later post.

PREDICATE (optional)
Another less well known section of a SQL statement is the Predicate. The Predicate follows the SELECT keyword and precedes the Field List. There are two important predicates: TOP and DISTINCT.

Top Predicate
The Top Predicate allows you to display the top(x) values from a query in terms of either a number or percentage. I can't really talk about the TOP predicate without discussing the ORDER BY clause because the rows displayed are determined by the ORDER BY clause. Examples:
SELECT TOP 10 ProductName FROM Products ORDER BY Cost
This will display ten rows of ProductNames whose costs are the LOWEST.
SELECT TOP 25% ProductName FROM Products ORDER BY Cost DESC
This will display the 25% of the rows of the Products table whose costs are HIGHEST.
In other words, the query creates a standard Select Query, applies the sort order in the ORDER BY clause, then displays just the top X values from the query.
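The sort-then-cut-off idea can be sketched outside Access too, with one important caveat: SQLite, which I am using here through Python's built-in sqlite3 module as a stand-in (an assumption), has no TOP predicate. The sketch swaps in SQLite's LIMIT clause, which goes at the end of the statement instead of after SELECT and has no percentage form.

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, Cost INTEGER)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("Ax", 3), ("Wrench", 4), ("Hammer", 15), ("Wrench", 5), ("Hammer", 10)],
)

# Rough counterpart of: SELECT TOP 2 ProductName FROM Products ORDER BY Cost
cheapest_two = [
    row[0]
    for row in conn.execute("SELECT ProductName FROM Products ORDER BY Cost LIMIT 2")
]

# Rough counterpart of: SELECT TOP 2 ProductName FROM Products ORDER BY Cost DESC
priciest_two = [
    row[0]
    for row in conn.execute(
        "SELECT ProductName FROM Products ORDER BY Cost DESC LIMIT 2"
    )
]

print(cheapest_two)
print(priciest_two)
```

In both dialects the sort direction decides whether you get the lowest or the highest values, which is why the ORDER BY clause matters so much here.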
To add a TOP predicate in the Query Builder, go to the Properties of the Query and look for the Top Values property:
A word about duplicates. If several rows tie for the last place in the Top rows, they will all be displayed. So, sometimes Top 10 could return 11 or even more rows. It is possible to remove these duplicates with a subquery. To do that and a lot of interesting things with the TOP predicate, download a free database sample from my website called TopQuery, which goes into more detail, including:
Removing Duplicates.
Top values with Aggregates (Totals Query)
Top values per Group
Returning Random X records from your table.
User input TOP value (parameter).

DISTINCT Predicate
The DISTINCT predicate will remove duplicates from your result set based on the fields in your Field List. For instance, if I had a Products table that looked like this:
ProductName  Cost
-----------  ----
Ax           $3
Hammer       $15
Hammer       $10
Wrench       $5
Wrench       $4
SELECT DISTINCT ProductName FROM Products
Would return:
ProductName
-----------
Ax
Hammer
Wrench
The duplicated rows are removed. The effect is the same as using a GROUP BY clause without an aggregate function:
SELECT ProductName FROM Products GROUP BY ProductName;
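A quick way to convince yourself that the two statements are equivalent is to run both against the same rows. The sketch below does that in Python with the built-in sqlite3 module standing in for Access (my assumption). The ORDER BY is there only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, Cost INTEGER)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("Ax", 3), ("Hammer", 15), ("Hammer", 10), ("Wrench", 5), ("Wrench", 4)],
)

# DISTINCT removes rows that repeat across the fields in the Field List.
distinct_names = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT ProductName FROM Products ORDER BY ProductName"
    )
]

# Grouping on the same field, with no aggregate, gives one row per value.
grouped_names = [
    row[0]
    for row in conn.execute(
        "SELECT ProductName FROM Products GROUP BY ProductName ORDER BY ProductName"
    )
]

print(distinct_names)
print(grouped_names)
```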
DISTINCTROW
The DISTINCTROW predicate is unique to Access. It returns records that are unique across the entire record, not just across the fields in the Field List.
SELECT DISTINCTROW ProductName FROM Products
returns
ProductName
-----------
Ax
Hammer
Hammer
Wrench
Wrench
Honestly, though, I've never found a good use for DISTINCTROW because I always have a Primary Key in all my tables, so all of the rows are already unique. To create the DISTINCT predicate in the Query Builder, set the Unique Values property to Yes; to create the DISTINCTROW predicate, set the Unique Records property to Yes.
TRANSFORM...PIVOT (Crosstab Query)
The Crosstab Query is an Access-specific query that allows you to display recordset results in a much more compact form. For instance, if I had an OrderDetail table that looked like this:
I could use a Crosstab Query to show the data like this:
[include CrosstabFigure2.jpg]
The Crosstab Query uses two clauses: the TRANSFORM clause, which comes before the SELECT statement, and the PIVOT clause, which follows the GROUP BY. In SQL View, it looks like this:
TRANSFORM Sum(Quantity) AS SumOfQuantity SELECT CustomerNum, OrderID FROM OrderDetail GROUP BY CustomerNum, OrderID PIVOT ProductName;
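To get a feel for what the crosstab statement above produces, here is a rough emulation in Python using the built-in sqlite3 module. SQLite has no TRANSFORM or PIVOT, so this sketch swaps in a different technique: it builds one SUM(CASE ...) column per product name. The sample rows are hypothetical, and this is not how Access implements crosstabs internally.

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OrderDetail "
    "(CustomerNum TEXT, OrderID INTEGER, ProductName TEXT, Quantity INTEGER)"
)
# Hypothetical rows; order 1 contains Hammer twice to show the summing.
conn.executemany(
    "INSERT INTO OrderDetail VALUES (?, ?, ?, ?)",
    [("101", 1, "Hammer", 2), ("101", 1, "Hammer", 1), ("101", 1, "Wrench", 1),
     ("101", 2, "Hammer", 3), ("102", 3, "Ax", 4), ("102", 3, "Hammer", 1)],
)

# PIVOT ProductName: every distinct product name becomes a column heading.
products = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT ProductName FROM OrderDetail ORDER BY ProductName"
    )
]
# TRANSFORM Sum(Quantity): one conditional sum per product column.
# The alias is inserted as text, so this assumes names without double quotes.
pivot_columns = ", ".join(
    f'SUM(CASE WHEN ProductName = ? THEN Quantity END) AS "{name}"'
    for name in products
)
sql = (
    f"SELECT CustomerNum, OrderID, {pivot_columns} "
    "FROM OrderDetail GROUP BY CustomerNum, OrderID "
    "ORDER BY CustomerNum, OrderID"
)
cursor = conn.execute(sql, products)
column_names = [col[0] for col in cursor.description]
crosstab_rows = cursor.fetchall()

print(column_names)
for row in crosstab_rows:
    print(row)
```

Each CustomerNum and OrderID pair becomes a single row, and a product that was not on an order shows up as an empty (None) cell.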
In the Query Builder, it looks like this:
The easiest way to learn Crosstab Queries is to use the Crosstab Wizard that's available when you create a New Query.

PARAMETERS Clause (Optional)
Queries can also accept user input. This input is called a Parameter. Parameters are placed in the WHERE clause of a SQL statement and take the form of a prompt in square brackets. For example:
SELECT * FROM Customers WHERE CustomerNum = [Enter Customer Number]
When the query is run, a parameter input box appears:
When the user inputs a valid customer number, the query returns the row(s) applicable to the customer entered.
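The same idea, a query with a hole that is filled in at run time, can be sketched in Python with the built-in sqlite3 module standing in for Access (my assumption). Here the value comes from a function argument instead of an input box. The find_customer helper, the CustomerName column, and the sample rows are all hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerNum TEXT, CustomerName TEXT)")
# Hypothetical customers, invented for this sketch.
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [("101", "Acme Hardware"), ("102", "Bolt and Nut Co"), ("201", "City Tools")],
)


def find_customer(connection, customer_num):
    """Return the rows for one customer; the value is bound as a parameter."""
    return connection.execute(
        "SELECT CustomerNum, CustomerName FROM Customers "
        "WHERE CustomerNum = :customer_num",
        {"customer_num": customer_num},
    ).fetchall()


matching_rows = find_customer(conn, "102")
no_rows = find_customer(conn, "999")  # a customer number that does not exist

print(matching_rows)
print(no_rows)
```

As with the Access prompt, the SQL text stays the same from run to run and only the supplied value changes.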
A parameter can also reference a control on a form. Suppose I have a Customer form and I want to see all of the orders associated with a particular customer. I could have a query that looks at the CustomerNum in the current form and returns all of the order numbers for that customer. So if I had a query that looked like this:
SELECT CustomerNum, OrderNumber FROM Customer INNER JOIN Orders ON Customer.CustomerID = Orders.CustomerID WHERE CustomerNum=[forms]![frmCustomer]![CustomerNum]
It would look at the value in the CustomerNum text box on the frmCustomer form and use that as the parameter value. If the form is showing customer 100, for example, the query returns all the orders for customer 100.
Access does a pretty good job of figuring out what the data type of a parameter is. However, there are times when it gets confused. This is particularly true when trying to use parameters with a Crosstab query. In these cases, you can define what the data type of the parameter will be with the PARAMETERS Clause.
The PARAMETERS clause goes before the SELECT statement in a Select query:
PARAMETERS [Enter Customer Number] Text ( 255 ); SELECT Customer.* FROM Customer WHERE CustomerNum=[Enter Customer Number];
Notice the semi-colon at the end of the PARAMETERS clause.
In a Crosstab query, it goes before the TRANSFORM statement:
PARAMETERS [Enter CustomerNum] Text ( 255 ); TRANSFORM Sum(Quantity) AS SumOfQuantity SELECT CustomerNum, OrderID FROM OrderDetail WHERE CustomerNum=[Enter CustomerNum] GROUP BY CustomerNum, OrderID PIVOT ProductName;
To create parameters in the Design View, choose Parameters from the Design Ribbon (A2007) or from the Query>Parameters Menu (A2003 and before). You will get a dialog box that looks something like this: