SQL's NULL values: comparing, sorting, converting and joining with real values
SQL's NULL value is a major point of confusion. The premise of NULL is to represent an unknown or missing value, though it's only a placeholder and not really a value by itself. A single NULL value can show up anywhere and torpedo your calculations. This perculiarity shows up when you try to compare it with anything else, including itself.
For example, look at these statements:
| Statement | Result | Comment |
|---|---|---|
SELECT NULL = NULL; |
NULL | NULL isn't comparable to NULL |
SELECT NULL > 0; |
NULL | NULL can't be compared |
SELECT NULL < 0; |
NULL | NULL can't be compared |
SELECT NULL = 0; |
NULL | NULL can't be equated |
SELECT NULL / 0; |
NULL | Are you expecting a Division by Zero error? |
SELECT NULL OR FALSE; |
NULL | NULL can't be used in boolean logic |
SELECT NULL OR TRUE; |
TRUE | Surprising and database dependent! |
Where are you finding NULL values?
Sure, your database schema has clean data with the proper NOT NULL constraints and any individual row does not have NULL values. Though your original data does not have NULLs, there are two cases where your query can generate rows with NULL values:
Outer joins, when you want to fill gaps in your data. Unlike an inner join that only returns data when the join condition matches at least one row from either table, an outer join returns all rows from the tables referenced in the join as long as those rows match any of the join conditions.
Certain aggregate functions such as a
SUM– for example, the sum of an empty list isNULL.
NULL values are a fact of life, and we should learn to deal with them.
1. Comparing NULL values
Say you have a table with a NULLable column type, and you want to find rows with a NULL value in that column. Since you can't use a equality operator in the WHERE clause (remember, NULL values can't be equated or compared), the right way to compare NULL values is to use the IS and IS NOT operators.
select nullable_column
from data_table
where nullable_column is null;
or
select nullable_column
from data_table
where nullable_column is not null;
1b. Additionally, Use IS DISTINCT FROM to treat NULL as a known value
PostgreSQL has two comparison statements IS DISTINCT FROM and IS NOT DISTINCT FROM that specially treats NULL values as if it were a known value. The truth table with this expression for two nullable columns a and b will look something like this:
| a | b | is distinct from |
|---|---|---|
| not NULL | not NULL | use the regular not equal '<>' operator |
| not NULL | NULL | return TRUE because they are different |
| NULL | not NULL | return TRUE because they are different |
| NULL | NULL | return FALSE because they are the same |
2. Sorting NULL values
When trying to sort a column with the ORDER BY clause on a nullable column, you'll find that NULL values come last. If you tried a descending sort via the DESC qualifier, NULL values will come first.
Fortunately, you have more control over this. There are two ways to tell your database where you want the NULL values to appear.
First, you can specify NULLS FIRST or NULLS LAST in your ORDER BY clause to specify where you want the NULL values. Consider a table users where the gender column is nullable.
select first_name, age, gender
from users
order by gender desc nulls last;
Second, you can convert the NULL values into a real value during query time or only when sorting. We'll make use of the COALESCE function for this:
-- treat NULL in gender column as -1
select first_name, age, coalesce(gender, -1)
from users
order by 3;
-- treat NULL in gender column as -1 only during sorting
select first_name, age, gender
from users
order by coalesce(gender, -1);
3. Use COALESCE to change NULL values into real values
The COALESCE function takes a number of arguments and returns the first non-NULL argument. If all of the arguments are NULL, then the COALESCE returns NULL. For example,
select coalesce('love', 'sql');
-- returns love
select coalesce(null, 'love', 'sql');
-- returns love
3b. Additionally, COALESCE is a shortcut for CASE
The COALESCE function is a syntactic shortcut for the CASE statement. For example, the code COALESCE(expr1, expr2, ..., exprN) is rewritten by the query parser as the following CASE statements:
case
when (expr1 is not null) then expr1
when (expr2 is not null) then expr2
...
else exprn
end
4. Joining tables on Nullable Columns
Say you have two tables: table1 and table2 with a column that can have NULL values. If you were to join these two tables on that value, you have to be aware of the fact that NULL is not equal to NULL, and therefore, it would appear as though you are missing rows from the resulting joined table.
No spam, ever! Unsubscribe any time. See past emails here.