Warning: Null value is eliminated by an aggregate or other SET operation

Warning: Null value is eliminated by an aggregate or other SET operation

In data warehousing where aggregation (GROUP BY) is one of the most common things done in queries, we often get the warning:

Warning: Null value is eliminated by an aggregate or other SET operation

In most cases what is happening is harmless. But what is happening?

NULL in aggregation

First we have to better understand NULL, I’ve already looked at Managing update differences: NULL and NOT NULL. NULL is both amazingly useful, and from time to time the thorn in the side of anyone working with databases.

Have a look at the below example:

WITH data AS (
SELECT CAST(1 AS INT) AS c1, CAST(1 AS INT)    AS c2 UNION ALL
SELECT CAST(1 AS INT) AS c1, CAST(2 AS INT)    AS c2 UNION ALL
SELECT CAST(1 AS INT) AS c1, CAST(NULL AS INT) AS c2
)
SELECT SUM(c1)   AS [SUM_c1],
       COUNT(c1) AS [COUNT_c1],
       SUM(c2)   AS [SUM_c2],    -- Note: This causes the "Warning"
       COUNT(c2) AS [COUNT_c2],  -- Note: This causes the "Warning"
       COUNT(*)  AS [COUNT_all]  -- Note: This gives a different result to above
  FROM data
;
SUM_c1 COUNT_c1 SUM_c2 COUNT_c2 COUNT_all
3 3 3 2 3

Note: the warning comes from this line: SUM(c2)

The warning is telling us that the NULL value is being eliminated (thrown out) by the aggregation. So in this instance the value might as well be 0.

The danger of NULL in aggregation

NULL in COUNT

In the above example, the more interesting threat to processing and results is the COUNT(c2) as when specifying a column with a NULL, the value is discarded and no count is returned.

Note that COUNT(*) is safe when trying to get a count of rows in an aggregation.

NULL in addition

NULL in addition (as part of a row level expression) when added to another number returns NULL, see this example:

SELECT 1 + 1                 AS [c1],
       1 + NULL              AS [c2], -- The problem
       1 + COALESCE(NULL, 0) AS [c3]  -- The simple solution
;
c1 c2 c3
2 NULL 1

On a row by row basis we will see it quickly. But as shown with the SUM above (generating the warning) when combined with an aggregation that NULL can get easily lost:

WITH data AS (
SELECT 1 + 1                 AS [c1],
       1 + NULL              AS [c2], -- The problem
       1 + COALESCE(NULL, 0) AS [c3]  -- The simple solution
UNION ALL
SELECT 1                     AS [c1],
       1                     AS [c2],
       1                     AS [c3]
)
SELECT SUM([c1]) AS [c1_sum],
       SUM([c2]) AS [c2_sum], -- This is incorrect
       SUM([c3]) AS [c3_sum]  -- This is what we want
  FROM data
;
c1_sum c2_sum c3_sum
3 1 2

In the above example, the value for c2_sum is probably incorrect for what most people would want/expect as it’s not including the first value being added.

For c3 (and c3_sum) the use of COALESCE replaces the NULL with 0 and returns the expected result.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.