
Session: G12
Null and Void? Dealing with Nulls in DB2
Craig S. Mullins
President & Principal Consultant
Mullins Consulting, Inc.
http://www.CraigSMullins.com
Thursday, May 11, 2006 • 08:30 a.m. – 09:40 a.m.
Platform: DB2 for z/OS

Agenda
• Definition
• Some History
• Types of Nulls
• Inapplicable versus Applicable Data
• Nulls and Keys
• Distinguished Nulls
• Using Nulls in DB2
• Problems with Nulls
• Guidance and Advice

© 2006, Mullins Consulting, Inc.

What is a NULL?
• NULL represents the absence of a value.
• It is not the same as zero or an empty string.
• A null is not a “null value” – there is no value.
• Maybe a “Null Lack of Value”

What is the Difference, You Ask?
• Consider the following columns:
  • TERMINATION_DATE – null or a valid date?
  • SALARY – null or zero?
  • SSN – non-US resident?
  • ADDRESS – different composition by country, so let some components be null?
  • HAIR_COLOR – what about bald men?

When are nulls useful? Defining a column as nullable provides a placeholder for data you might not yet know. For example, when a new employee is hired and is inserted into the EMP table, what should the employee termination date column be set to? I don’t know about you, but I wouldn’t want any valid date to be set in that column for my employee record. Instead, null can be used to specify that the termination date is currently unknown.

Let’s consider another example. Suppose that we also capture each employee’s hair color when they are hired. Consider three potential entity occurrences: a man with black hair, a woman with unknown hair color, and a bald man. The hair color of the woman and of the bald man could both be set to null, but for different reasons.
The woman’s hair color would be null meaning presently unknown; the bald man’s hair color could be null too, in this case meaning not applicable. How could you handle this without using nulls? You would need to create special values for the HairColor column that mean “bald” and “unknown.” This is possible for a CHAR column like HairColor. But what about a DB2 DATE column? All occurrences of a column assigned as a DATE data type are valid dates. It might not be possible to use a special date value to mean “unknown.” This is where using nulls can be practical.

Ted Codd on Nulls
• Dr. E.F. Codd built the notion of nulls into the relational model, but not in the original 1970 paper
• The extensions to RM/V1 to support nulls weren't defined until the 1979 paper Extending the Database Relational Model to Capture More Meaning, ACM TODS 4, No. 4 (December 1979).
• He later defined two types of nulls in RM/V2
• These were called marks
  • A-mark: applicable, but unknown
  • I-mark: inapplicable (known to be inapplicable)

A null allows a column to be specifically empty. A null is distinct from an empty string or a number with a value of zero. Of course, a null cannot apply to primary keys, which must contain values. Most database implementations support the concept of a non-null field constraint that prevents nulls in a specific table column… as does DB2.

Inapplicable versus Applicable
• Inapplicable:
  • These are perhaps better dealt with by creating better data models
  • Social Security Number for a European
  • Consulting fee for a full-time employee
• Applicable, but unknown:
  • These are the ones that are “true” nulls
  • Future dates not yet known
  • Information not supplied
  • Not everyone has a middle name

Modeling Fixes Inapplicable Columns
Super-type – sub-type modeling can be a better approach to inapplicable nulls. Consider the approach shown in the slide. Wouldn’t it be better to have only those columns which are always applicable in each entity/table? For example, the alternative here would be to have just the Supplier entity, but it would have a nullable Postal Code and a nullable Zip Code. For domestic suppliers postal code would be null (inapplicable); for international suppliers zip code would be null (inapplicable). By creating the sub-types of Supplier we can model our way out of inapplicable nulls.

Nulls and Keys
• Primary Key
  • A primary key cannot be null
• Foreign Key(s)
  • Foreign keys can be null
  • ON DELETE SET NULL – when the parent row whose primary key the foreign key refers to is deleted, the foreign key is set to null

Distinguished Nulls: Not all Nulls are Equal
• Sometimes, based on domains and knowledge of the data, a null may not really be a null.
• Consider the COLOR column
  • With a domain of BLACK, WHITE, and RED
  • With a UNIQUE constraint on it
• Now, say we have three rows in the table, one of which has “BLACK” for the COLOR and the other two are null.
• Are those nulls really completely null? We know something about them.
• What if one of the nulls is changed to WHITE? That last null is not really null, is it?
• But DB2 does not automatically change it to RED…
• Can you (or your program)?

Create a domain called Tricolor that is limited to the values 'Red', 'White', and 'Black' and a column in a table drawn from that domain with a UNIQUE constraint on it. If my table has a 'Red' and two NULL values in that column, I have some information about the two NULLs. I know they will be either ('White', 'Black') or ('Black', 'White') when their rows are resolved.
This is what Chris Date calls a "distinguished NULL", which means we have some information in it. If my table has a 'Red', a 'White', and a NULL value in that column, can I change the last NULL to 'Black' because it can only be 'Black' under the rule? Or do I have to wait until I see an actual value for that row? There is no clear way to handle this in SQL. Multiple values cannot be put in a column, nor can the database automatically change values as part of the column declaration.

A Lot of Arguments
• Some relational experts have argued against nulls (e.g. Date, Darwen, Pascal)
• There are many reasons; many of them centered around the “quirks” of how nulls are implemented and the difficulty of “three-valued” logic.
• Also, nulls violate “relational” – at least according to some
  • A null is not a value, and therefore cannot exist in a mathematical relation

What About DB2?
• Well, DB2 is not a “pure” relational DBMS
• It supports and uses nulls
• Let’s take a look at how DB2 uses nulls and how you use DB2 to specify nulls and manipulate nulls in your data
• …as well as some of the “oddities” this can create.

Nulls in DB2
• All columns are nullable unless defined with:
  • NOT NULL or NOT NULL WITH DEFAULT
• DB2 uses a null-indicator to set a column to null
  • One byte
  • Zero or positive value means not null
  • Negative value means null
  • Not stored in column, but associated with column

All data types include the null “value”. Distinct from all non-null values, a null is a special “value” that denotes the absence of a value. Although all data types include the null, some sources of values cannot provide for nulls.
For example, constants, columns that are defined as NOT NULL, and special registers cannot contain nulls; the COUNT and COUNT_BIG functions cannot return a null value; and ROWID columns cannot store a null although a null can be returned for a ROWID column as the result of a query.

Nulls in DB2 (continued)
• Nulls do NOT save space – ever!
• Nulls are not variable columns but…
  • Can be used with variable columns to (perhaps) save space
• Programs have to be coded to explicitly deal with the possibility of null
  • (example on next slide)

Let’s take a moment to clear up a common misunderstanding right here: nulls NEVER save storage space in DB2 for OS/390 and z/OS. Every nullable column requires one additional byte of storage for the null indicator. So, a CHAR(10) column that is nullable will require 11 bytes of storage per row – 10 for the data and 1 for the null indicator. This is the case regardless of whether the column is set to null or not. In some DBMS products a null column is treated as variable length. If the column is set to null it will not take up storage space (in other words, the row becomes variable length). This is NOT the case with DB2.

Using Null Indicator Variables
You can use null indicator variables with corresponding host variables in the following situations:
• SET clause of the UPDATE statement
• VALUES clause of the INSERT statement
• INTO clause of the SELECT or FETCH statement
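
The three situations listed for null indicator variables can be imitated outside embedded SQL. What follows is a minimal sketch and not DB2 code: it uses Python's standard sqlite3 module as a stand-in database and represents each host variable together with its indicator as a (value, indicator) pair, following the convention that a negative indicator means null and zero or positive means not null. The helper names to_param and from_column, the layout of the emp table, and the sample values are all assumptions made for illustration. In a real DB2 program the indicator would be a separate host variable declared next to the data host variable.

```python
import sqlite3


def to_param(value, indicator):
    """Turn a (host variable, indicator) pair into a bind parameter.

    Hypothetical helper: a negative indicator means null, so the value
    is ignored; zero or positive means the value is real.
    """
    return None if indicator < 0 else value


def from_column(column_value):
    """Turn a fetched column into a (host variable, indicator) pair.

    Hypothetical helper: a null column yields indicator -1, and the
    value part must then not be used.
    """
    if column_value is None:
        return None, -1
    return column_value, 0


# SQLite in memory stands in for DB2; the table layout is assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (empno INTEGER PRIMARY KEY, termination_date TEXT)"
)

# VALUES clause of INSERT: indicator -1 marks the termination date as null.
conn.execute("INSERT INTO emp VALUES (?, ?)", (10, to_param("", -1)))
conn.execute("INSERT INTO emp VALUES (?, ?)", (20, to_param("", -1)))

# SET clause of UPDATE: indicator 0 supplies a real value.
conn.execute(
    "UPDATE emp SET termination_date = ? WHERE empno = ?",
    (to_param("2006-05-11", 0), 20),
)

# INTO clause of SELECT or FETCH: test the indicator before using the value.
fetched = {}
for empno, term in conn.execute(
    "SELECT empno, termination_date FROM emp ORDER BY empno"
):
    value, indicator = from_column(term)
    fetched[empno] = (value, indicator)
    status = "still employed" if indicator < 0 else f"terminated {value}"
    print(empno, status)
```

The point of the sketch is the order of operations on retrieval: the program checks the indicator first and touches the value only when the indicator is not negative, which is the explicit coding for the possibility of null that the earlier slide calls for.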