Warning: Null value is eliminated by an aggregate or other SET operation
In data warehousing where aggregation (GROUP BY) is one of the most common things done in queries, we often get the warning:
Warning: Null value is eliminated by an aggregate or other SET operation
In most cases what is happening is harmless. But what is happening?
NULL in aggregation
First we have to better understand NULL, I’ve already looked at Managing update differences: NULL and NOT NULL. NULL is both amazingly useful, and from time to time the thorn in the side of anyone working with databases.
Have a look at the below example:
WITH data AS (
SELECT CAST(1 AS INT) AS c1, CAST(1 AS INT) AS c2 UNION ALL
SELECT CAST(1 AS INT) AS c1, CAST(2 AS INT) AS c2 UNION ALL
SELECT CAST(1 AS INT) AS c1, CAST(NULL AS INT) AS c2
)
SELECT SUM(c1) AS [SUM_c1],
COUNT(c1) AS [COUNT_c1],
SUM(c2) AS [SUM_c2], -- Note: This causes the "Warning"
COUNT(c2) AS [COUNT_c2], -- Note: This causes the "Warning"
COUNT(*) AS [COUNT_all] -- Note: This gives a different result to above
FROM data
;
| SUM_c1 | COUNT_c1 | SUM_c2 | COUNT_c2 | COUNT_all |
| 3 | 3 | 3 | 2 | 3 |
Note: the warning comes from this line: SUM(c2)
The warning is telling us that the NULL value is being eliminated (thrown out) by the aggregation. So in this instance the value might as well be 0.
The danger of NULL in aggregation
NULL in COUNT
In the above example, the more interesting threat to processing and results is the COUNT(c2) as when specifying a column with a NULL, the value is discarded and no count is returned.
Note that COUNT(*) is safe when trying to get a count of rows in an aggregation.
NULL in addition
NULL in addition (as part of a row level expression) when added to another number returns NULL, see this example:
SELECT 1 + 1 AS [c1],
1 + NULL AS [c2], -- The problem
1 + COALESCE(NULL, 0) AS [c3] -- The simple solution
;
| c1 | c2 | c3 |
| 2 | NULL | 1 |
On a row by row basis we will see it quickly. But as shown with the SUM above (generating the warning) when combined with an aggregation that NULL can get easily lost:
WITH data AS (
SELECT 1 + 1 AS [c1],
1 + NULL AS [c2], -- The problem
1 + COALESCE(NULL, 0) AS [c3] -- The simple solution
UNION ALL
SELECT 1 AS [c1],
1 AS [c2],
1 AS [c3]
)
SELECT SUM([c1]) AS [c1_sum],
SUM([c2]) AS [c2_sum], -- This is incorrect
SUM([c3]) AS [c3_sum] -- This is what we want
FROM data
;
| c1_sum | c2_sum | c3_sum |
| 3 | 1 | 2 |
In the above example, the value for c2_sum is probably incorrect for what most people would want/expect as it’s not including the first value being added.
For c3 (and c3_sum) the use of COALESCE replaces the NULL with 0 and returns the expected result.