SQL's NULL values: comparing, sorting, converting and joining with real values

SQL's NULL value is a major point of confusion. The premise of NULL is to represent an unknown or missing value, though it's only a placeholder and not really a value by itself. A single NULL value can show up anywhere and torpedo your calculations. This perculiarity shows up when you try to compare it with anything else, including itself.

For example, look at these statements:

Statement Result Comment
SELECT NULL = NULL; NULL NULL isn't comparable to NULL
SELECT NULL > 0; NULL NULL can't be compared
SELECT NULL < 0; NULL NULL can't be compared
SELECT NULL = 0; NULL NULL can't be equated
SELECT NULL / 0; NULL Are you expecting a Division by Zero error?
SELECT NULL OR FALSE; NULL NULL can't be used in boolean logic
SELECT NULL OR TRUE; TRUE Surprising and database dependent!

Where are you finding NULL values?

Sure, your database schema has clean data with the proper NOT NULL constraints and any individual row does not have NULL values. Though your original data does not have NULLs, there are two cases where your query can generate rows with NULL values:

  • Outer joins, when you want to fill gaps in your data. Unlike an inner join that only returns data when the join condition matches at least one row from either table, an outer join returns all rows from the tables referenced in the join as long as those rows match any of the join conditions.

  • Certain aggregate functions such as a SUM – for example, the sum of an empty list is NULL.

NULL values are a fact of life, and we should learn to deal with them.

1. Comparing NULL values

Say you have a table with a NULLable column type, and you want to find rows with a NULL value in that column. Since you can't use a equality operator in the WHERE clause (remember, NULL values can't be equated or compared), the right way to compare NULL values is to use the IS and IS NOT operators.

SELECT nullable_column 
FROM data_table
WHERE nullable_column IS NULL;

or

SELECT nullable_column 
FROM data_table
WHERE nullable_column IS NOT NULL;

1b. Additionally, Use IS DISTINCT FROM to treat NULL as a known value

PostgreSQL has two comparison statements IS DISTINCT FROM and IS NOT DISTINCT FROM that specially treats NULL values as if it were a known value. The truth table with this expression for two nullable columns a and b will look something like this:

a b is distinct from
not NULL not NULL use the regular not equal '<>' operator
not NULL NULL return TRUE because they are different
NULL not NULL return TRUE because they are different
NULL NULL return FALSE because they are the same

2. Sorting NULL values

When trying to sort a column with the ORDER BY clause on a nullable column, you'll find that NULL values come last. If you tried a descending sort via the DESC qualifier, NULL values will come first.

Fortunately, you have more control over this. There are two ways to tell your database where you want the NULL values to appear.

First, you can specify NULLS FIRST or NULLS LAST in your ORDER BY clause to specify where you want the NULL values. Consider a table users where the gender column is nullable.

SELECT first_name, age, gender
FROM users
ORDER BY gender DESC NULLS LAST;

Second, you can convert the NULL values into a real value during query time or only when sorting. We'll make use of the COALESCE function for this:

-- treat NULL in gender column as -1
SELECT first_name, age, COALESCE(gender, -1)
FROM users
ORDER BY 3;

-- treat NULL in gender column as -1 only during sorting
SELECT first_name, age, gender
FROM users
ORDER BY COALESCE(gender, -1);

3. Use COALESCE to change NULL values into real values

The COALESCE function takes a number of arguments and returns the first non-NULL argument. If all of the arguments are NULL, then the COALESCE returns NULL. For example,

SELECT COALESCE('love', 'sql');
-- returns love

SELECT COALESCE(NULL, 'love', 'sql');
-- returns love

3b. Additionally, COALESCE is a shortcut for CASE

The COALESCE function is a syntactic shortcut for the CASE statement. For example, the code COALESCE(expr1, expr2, ..., exprN) is rewritten by the query parser as the following CASE statements:

CASE
  WHEN (expr1 IS NOT NULL) THEN expr1
  WHEN (expr2 IS NOT NULL) THEN expr2
  ...
  ELSE exprN
END

4. Joining tables on Nullable Columns

Say you have two tables: table1 and table2 with a column that can have NULL values. If you were to join these two tables on that value, you have to be aware of the fact that NULL is not equal to NULL, and therefore, it would appear as though you are missing rows from the resulting joined table.


👋 No fuss, just SQL We are open sourcing everything from the experience working with our agency clients. They spend thousands of dollars to get this level of detailed analysis – which you can now get for free. We send one update every week. Join 400+ data analysts who are leveling up with our recipes. 👊

No spam, ever! Unsubscribe any time.