SQL's NULL values: comparing, sorting, converting and joining with real values
NULL value is a major point of confusion. The premise of
NULL is to represent an unknown or missing value, though it's only a placeholder and not really a value by itself. A single
NULL value can show up anywhere and torpedo your calculations. This perculiarity shows up when you try to compare it with anything else, including itself.
For example, look at these statements:
||NULL||NULL isn't comparable to NULL|
||NULL||NULL can't be compared|
||NULL||NULL can't be compared|
||NULL||NULL can't be equated|
||NULL||Are you expected a Division by Zero error?|
||NULL||NULL can't be used in boolean logic|
||TRUE||Surprising and database dependent!|
Where are you finding
Sure, your database schema has clean data with the proper
NOT NULL constraints and any individual row does not have
NULL values. Though your original data does not have NULLs, there are two cases where your query can generate rows with NULL values:
Outer joins, when you want to fill gaps in your data. Unlike an inner join that only returns data when the join condition matches at least one row from either table, an outer join returns all rows from the tables referenced in the join as long as those rows match any of the join conditions.
Certain aggregate functions such as a
SUM– for example, the sum of an empty list is
NULL values are a fact of life, and we should learn to deal with them.
1. Comparing NULL values
Say you have a table with a NULLable column type, and you want to find rows with a
NULL value in that column. Since you can't use a equality operator in the
WHERE clause (remember,
NULL values can't be equated or compared), the right way to compare
NULL values is to use the
IS NOT operators.
SELECT nullable_column FROM data_table WHERE nullable_column IS NULL;
SELECT nullable_column FROM data_table WHERE nullable_column IS NOT NULL;
1b. Additionally, Use
IS DISTINCT FROM to treat
NULL as a known value
PostgreSQL has two comparison statements
IS DISTINCT FROM and
IS NOT DISTINCT FROM that specially treats
NULL values as if it were a known value. The truth table with this expression for two nullable columns
b will look something like this:
|a||b||is distinct from|
|not NULL||not NULL||use the regular not equal '<>' operator|
2. Sorting NULL values
When trying to sort a column with the
ORDER BY clause on a nullable column, you'll find that
NULL values come last. If you tried a descending sort via the
NULL values will come first.
Fortunately, you have more control over this. There are two ways to tell your database where you want the
NULL values to appear.
First, you can specify
NULLS FIRST or
NULLS LAST in your
ORDER BY clause to specify where you want the
NULL values. Consider a table
users where the
gender column is nullable.
SELECT first_name, age, gender FROM users ORDER BY gender DESC NULLS LAST;
Second, you can convert the NULL values into a real value during query time or only when sorting. We'll make use of the
COALESCE function for this:
-- treat NULL in gender column as -1 SELECT first_name, age, COALESCE(gender, -1) FROM users ORDER BY 3; -- treat NULL in gender column as -1 only during sorting SELECT first_name, age, gender FROM users ORDER BY COALESCE(gender, -1);
COALESCE to change
NULL values into real values
COALESCE function takes a number of arguments and returns the first non-NULL argument. If all of the arguments are
NULL, then the
NULL. For example,
SELECT COALESCE('love', 'sql'); -- returns love SELECT COALESCE(NULL, 'love', 'sql'); -- returns love
COALESCE is a shortcut for
COALESCE function is a syntactic shortcut for the
CASE statement. For example, the code
COALESCE(expr1, expr2, ..., exprN) is rewritten by the query parser as the following
CASE WHEN (expr1 IS NOT NULL) THEN expr1 WHEN (expr2 IS NOT NULL) THEN expr2 ... ELSE exprN END
4. Joining tables on Nullable Columns
Say you have two tables:
table2 with a column that can have
NULL values. If you were to join these two tables on that value, you have to be aware of the fact that
NULL is not equal to
NULL, and therefore, it would appear as though you are missing rows from the resulting joined table.
You are reading a series of Advanced SQL tricks.
Read this next: Picking the Right Chart Type
Join 300+ data analysts who are using our weekly SQL tips to improve their skills. No spam, promise!