NULL: When a value is neither IN or NOT IN a list.

NULL: When a value is neither IN or NOT IN a list.

NULL is always a challenge for equality, I’ve written about it before. I recently produced some samples for training/explanation of NULL when it comes to IN and NOT IN, and thought I’d include it here.

In the past I have looked at NULL from the perspectives of:

NULL and Equality

The key thing to remember is that NULL is neither equal to, or not equal to NULL. As shown by the below SQL:

--------------------------------------------------------------------------------
-- Classic equality (without NULL)
--------------------------------------------------------------------------------
SELECT 1 AS output_craeted
 WHERE 1 = 1
;

--------------------------------------------------------------------------------
-- NULL is neither equal to, nor NOT equal to NULL
-- It IS either NULL or NOT NULL
--------------------------------------------------------------------------------
SELECT 2 AS output_craeted
 WHERE NULL = NULL
;

SELECT 2 AS output_craeted
 WHERE NULL <> NULL
;

SELECT 3 AS output_craeted
 WHERE NULL IS NOT NULL
;

SELECT 4 AS output_craeted
 WHERE NULL IS NULL
;

Example of IN and NOT IN

When using IN and NOT IN, the SQL underneath is creating a collection of equality tests, all of which NULL will fail.

Below are some working examples of using NULL with IN and NOT IN lists, I’m going to work with the same small set of sample data throughout the examples here.

Create sample data

--------------------------------------------------------------------------------
-- Create source data
--------------------------------------------------------------------------------
CREATE TABLE #ValueList
(
    ID     INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    val_1  VARCHAR(10),
    metric INT
);


INSERT
  INTO #ValueList
      (val_1,metric)
VALUES('a',  10),
      ('b',  20),
      ('c',  30),
      (NULL, 40)
;

-- View our sample data
SELECT *
  FROM #ValueList
;

Try using IN and NOT IN

--------------------------------------------------------------------------------
-- Hard coded list: NOT IN
--------------------------------------------------------------------------------
SELECT *
  FROM #ValueList
 WHERE val_1 NOT IN ('a', 'b') -- Skips NULL
;

SELECT *
  FROM #ValueList
 WHERE val_1 NOT IN ('a', 'b', NULL) -- Returns no results
;

--------------------------------------------------------------------------------
-- Hard coded list: IN
--------------------------------------------------------------------------------

SELECT *
  FROM #ValueList
 WHERE val_1 IN ('a', 'b') -- Skips NULL (as expected)
;

SELECT *
  FROM #ValueList
 WHERE val_1 IN ('a', 'b', NULL) -- Skips NULL (not as expected)
;

This can be seen in action over at rextester.com

Solutions

The aim is to be consistent on treating NULL as a distinct value, which is either IN or NOT IN the given list. It is key to note that in neither solution can NULL be in the list.

The solutions can also be seen in action over at rextester.com

COALESCE

If it’s just a quick query, then the simplest solution is to use COALESCE, this will allow NULL values to be set to something specifically IN or NOT IN the given list to control behaviour:

SELECT *
  FROM #ValueList
 WHERE COALESCE(val_1, 'Z') NOT IN ('a', 'b')
;

Note: To have the effect of adding NULL to the list, set the replacement value to something in the list.

Specific AND/OR condition for NULL

Similar to the COALESCE in its’ simplicity is being explicit in handling the NULL values using IS NULL and IS NOT NULL. This does extend the amount of code, but it’s also very clear on what the extra logic accomplishes. I’d recommend this for a small/medium term solution.

SELECT *
  FROM #ValueList
 WHERE(   val_1 NOT IN ('a', 'b')
       OR val_1 IS NULL
      )
;

Note: To have the effect of adding NULL to the list, toggle the second condition as required.

Lookup List

Storing the values in a table is the ideal long term solution. It allows for the values/conditions to be reused in multiple places, and any additions can be applied with just one update. It also has consistent behaviour with NULLs:

--------------------------------------------------------------------------------
-- Create a lookup list
--------------------------------------------------------------------------------
CREATE TABLE #LookupList
(
    val_1 VARCHAR(10) NOT NULL PRIMARY KEY
);

INSERT
  INTO #LookupList
      (val_1)
VALUES('a'),
      ('b')
;
--------------------------------------------------------------------------------
-- Filter using WHERE NOT EXISTS
--------------------------------------------------------------------------------
SELECT *
  FROM #ValueList vl
 WHERE NOT EXISTS
      (SELECT *
         FROM #LookupList ll
        WHERE ll.val_1 = vl.val_1
      )
;

--------------------------------------------------------------------------------
-- Filter using WHERE EXISTS
--------------------------------------------------------------------------------
SELECT *
  FROM #ValueList vl
 WHERE EXISTS
      (SELECT *
         FROM #LookupList ll
        WHERE ll.val_1 = vl.val_1
      )
;

Note: To have the effect of adding NULL to the list, it will need to be handled in a COALESCE or Specific condition for NULL.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.