NULL: When a value is neither IN or NOT IN a list.
NULL is always a challenge for equality, I’ve written about it before. I recently produced some samples for training/explanation of NULL when it comes to IN and NOT IN, and thought I’d include it here.
In the past I have looked at NULL from the perspectives of:
- Managing update differences NULL and NOT NULL
- Warning NULL value is eliminated by an aggregate or other set operation
NULL and Equality
The key thing to remember is that NULL is neither equal to, or not equal to NULL. As shown by the below SQL:
-------------------------------------------------------------------------------- -- Classic equality (without NULL) -------------------------------------------------------------------------------- SELECT 1 AS output_craeted WHERE 1 = 1 ; -------------------------------------------------------------------------------- -- NULL is neither equal to, nor NOT equal to NULL -- It IS either NULL or NOT NULL -------------------------------------------------------------------------------- SELECT 2 AS output_craeted WHERE NULL = NULL ; SELECT 2 AS output_craeted WHERE NULL <> NULL ; SELECT 3 AS output_craeted WHERE NULL IS NOT NULL ; SELECT 4 AS output_craeted WHERE NULL IS NULL ;
Example of IN and NOT IN
When using IN and NOT IN, the SQL underneath is creating a collection of equality tests, all of which NULL will fail.
Below are some working examples of using NULL with IN and NOT IN lists, I’m going to work with the same small set of sample data throughout the examples here.
Create sample data
-------------------------------------------------------------------------------- -- Create source data -------------------------------------------------------------------------------- CREATE TABLE #ValueList ( ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY, val_1 VARCHAR(10), metric INT ); INSERT INTO #ValueList (val_1,metric) VALUES('a', 10), ('b', 20), ('c', 30), (NULL, 40) ; -- View our sample data SELECT * FROM #ValueList ;
Try using IN and NOT IN
-------------------------------------------------------------------------------- -- Hard coded list: NOT IN -------------------------------------------------------------------------------- SELECT * FROM #ValueList WHERE val_1 NOT IN ('a', 'b') -- Skips NULL ; SELECT * FROM #ValueList WHERE val_1 NOT IN ('a', 'b', NULL) -- Returns no results ; -------------------------------------------------------------------------------- -- Hard coded list: IN -------------------------------------------------------------------------------- SELECT * FROM #ValueList WHERE val_1 IN ('a', 'b') -- Skips NULL (as expected) ; SELECT * FROM #ValueList WHERE val_1 IN ('a', 'b', NULL) -- Skips NULL (not as expected) ;
This can be seen in action over at rextester.com
Solutions
The aim is to be consistent on treating NULL as a distinct value, which is either IN or NOT IN the given list. It is key to note that in neither solution can NULL be in the list.
The solutions can also be seen in action over at rextester.com
COALESCE
If it’s just a quick query, then the simplest solution is to use COALESCE, this will allow NULL values to be set to something specifically IN or NOT IN the given list to control behaviour:
SELECT * FROM #ValueList WHERE COALESCE(val_1, 'Z') NOT IN ('a', 'b') ;
Note: To have the effect of adding NULL to the list, set the replacement value to something in the list.
Specific AND/OR condition for NULL
Similar to the COALESCE in its’ simplicity is being explicit in handling the NULL values using IS NULL and IS NOT NULL. This does extend the amount of code, but it’s also very clear on what the extra logic accomplishes. I’d recommend this for a small/medium term solution.
SELECT * FROM #ValueList WHERE( val_1 NOT IN ('a', 'b') OR val_1 IS NULL ) ;
Note: To have the effect of adding NULL to the list, toggle the second condition as required.
Lookup List
Storing the values in a table is the ideal long term solution. It allows for the values/conditions to be reused in multiple places, and any additions can be applied with just one update. It also has consistent behaviour with NULLs:
-------------------------------------------------------------------------------- -- Create a lookup list -------------------------------------------------------------------------------- CREATE TABLE #LookupList ( val_1 VARCHAR(10) NOT NULL PRIMARY KEY ); INSERT INTO #LookupList (val_1) VALUES('a'), ('b') ;
-------------------------------------------------------------------------------- -- Filter using WHERE NOT EXISTS -------------------------------------------------------------------------------- SELECT * FROM #ValueList vl WHERE NOT EXISTS (SELECT * FROM #LookupList ll WHERE ll.val_1 = vl.val_1 ) ; -------------------------------------------------------------------------------- -- Filter using WHERE EXISTS -------------------------------------------------------------------------------- SELECT * FROM #ValueList vl WHERE EXISTS (SELECT * FROM #LookupList ll WHERE ll.val_1 = vl.val_1 ) ;
Note: To have the effect of adding NULL to the list, it will need to be handled in a COALESCE or Specific condition for NULL.