Warning: Null value is eliminated by an aggregate or other SET operation

Warning: Null value is eliminated by an aggregate or other SET operation

In data warehousing where aggregation (GROUP BY) is one of the most common things done in queries, we often get the warning:

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
Warning: Null value is eliminated by an aggregate or other SET operation
Warning: Null value is eliminated by an aggregate or other SET operation
Warning: Null value is eliminated by an aggregate or other SET operation

In most cases what is happening is harmless. But what is happening?

NULL in aggregation

First we have to better understand NULL, I’ve already looked at Managing update differences: NULL and NOT NULL. NULL is both amazingly useful, and from time to time the thorn in the side of anyone working with databases.

Have a look at the below example:

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
WITH data AS (
SELECT CAST(1 AS INT) AS c1, CAST(1 AS INT) AS c2 UNION ALL
SELECT CAST(1 AS INT) AS c1, CAST(2 AS INT) AS c2 UNION ALL
SELECT CAST(1 AS INT) AS c1, CAST(NULL AS INT) AS c2
)
SELECT SUM(c1) AS [SUM_c1],
COUNT(c1) AS [COUNT_c1],
SUM(c2) AS [SUM_c2], -- Note: This causes the "Warning"
COUNT(c2) AS [COUNT_c2], -- Note: This causes the "Warning"
COUNT(*) AS [COUNT_all] -- Note: This gives a different result to above
FROM data
;
WITH data AS ( SELECT CAST(1 AS INT) AS c1, CAST(1 AS INT) AS c2 UNION ALL SELECT CAST(1 AS INT) AS c1, CAST(2 AS INT) AS c2 UNION ALL SELECT CAST(1 AS INT) AS c1, CAST(NULL AS INT) AS c2 ) SELECT SUM(c1) AS [SUM_c1], COUNT(c1) AS [COUNT_c1], SUM(c2) AS [SUM_c2], -- Note: This causes the "Warning" COUNT(c2) AS [COUNT_c2], -- Note: This causes the "Warning" COUNT(*) AS [COUNT_all] -- Note: This gives a different result to above FROM data ;
WITH data AS (
SELECT CAST(1 AS INT) AS c1, CAST(1 AS INT)    AS c2 UNION ALL
SELECT CAST(1 AS INT) AS c1, CAST(2 AS INT)    AS c2 UNION ALL
SELECT CAST(1 AS INT) AS c1, CAST(NULL AS INT) AS c2
)
SELECT SUM(c1)   AS [SUM_c1],
       COUNT(c1) AS [COUNT_c1],
       SUM(c2)   AS [SUM_c2],    -- Note: This causes the "Warning"
       COUNT(c2) AS [COUNT_c2],  -- Note: This causes the "Warning"
       COUNT(*)  AS [COUNT_all]  -- Note: This gives a different result to above
  FROM data
;
SUM_c1 COUNT_c1 SUM_c2 COUNT_c2 COUNT_all
3 3 3 2 3

Note: the warning comes from this line: SUM(c2)

The warning is telling us that the NULL value is being eliminated (thrown out) by the aggregation. So in this instance the value might as well be 0.

The danger of NULL in aggregation

NULL in COUNT

In the above example, the more interesting threat to processing and results is the COUNT(c2) as when specifying a column with a NULL, the value is discarded and no count is returned.

Note that COUNT(*) is safe when trying to get a count of rows in an aggregation.

NULL in addition

NULL in addition (as part of a row level expression) when added to another number returns NULL, see this example:

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
SELECT 1 + 1 AS [c1],
1 + NULL AS [c2], -- The problem
1 + COALESCE(NULL, 0) AS [c3] -- The simple solution
;
SELECT 1 + 1 AS [c1], 1 + NULL AS [c2], -- The problem 1 + COALESCE(NULL, 0) AS [c3] -- The simple solution ;
SELECT 1 + 1                 AS [c1],
       1 + NULL              AS [c2], -- The problem
       1 + COALESCE(NULL, 0) AS [c3]  -- The simple solution
;
c1 c2 c3
2 NULL 1

On a row by row basis we will see it quickly. But as shown with the SUM above (generating the warning) when combined with an aggregation that NULL can get easily lost:

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
WITH data AS (
SELECT 1 + 1 AS [c1],
1 + NULL AS [c2], -- The problem
1 + COALESCE(NULL, 0) AS [c3] -- The simple solution
UNION ALL
SELECT 1 AS [c1],
1 AS [c2],
1 AS [c3]
)
SELECT SUM([c1]) AS [c1_sum],
SUM([c2]) AS [c2_sum], -- This is incorrect
SUM([c3]) AS [c3_sum] -- This is what we want
FROM data
;
WITH data AS ( SELECT 1 + 1 AS [c1], 1 + NULL AS [c2], -- The problem 1 + COALESCE(NULL, 0) AS [c3] -- The simple solution UNION ALL SELECT 1 AS [c1], 1 AS [c2], 1 AS [c3] ) SELECT SUM([c1]) AS [c1_sum], SUM([c2]) AS [c2_sum], -- This is incorrect SUM([c3]) AS [c3_sum] -- This is what we want FROM data ;
WITH data AS (
SELECT 1 + 1                 AS [c1],
       1 + NULL              AS [c2], -- The problem
       1 + COALESCE(NULL, 0) AS [c3]  -- The simple solution
UNION ALL
SELECT 1                     AS [c1],
       1                     AS [c2],
       1                     AS [c3]
)
SELECT SUM([c1]) AS [c1_sum],
       SUM([c2]) AS [c2_sum], -- This is incorrect
       SUM([c3]) AS [c3_sum]  -- This is what we want
  FROM data
;
c1_sum c2_sum c3_sum
3 1 2

In the above example, the value for c2_sum is probably incorrect for what most people would want/expect as it’s not including the first value being added.

For c3 (and c3_sum) the use of COALESCE replaces the NULL with 0 and returns the expected result.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.