Dates In SQL Server: Creating Date Range Batches

Dates In SQL Server: Creating Date Range Batches

I thought I’d covered everything I could cover on dates in SQL Server in my previous series. But a new challenge raised something which wasn’t already in there.

I found a scenario where I wanted to identify groups of consecutive dates over a larger date range. There was a process which took a start date and an end date and would process all data in that range. I wanted to go over a lot of history, but only reprocess the dates which had been impacted. I could identify the dates which needed reprocessing, many of which were sequential. But there were also numerous days and groups of days which didn’t need to be re-processed.

The solution was to identify the consecutive date ranges and put them into a cursor, with a start date and end date for each row. Solitary dates would be represented with one record having the same start and end date.

In the below worked solution, I’ve spread things out a bit to make reading and understanding a little easier. I believe easy to read and debug code is often better than highly compact code.

Sample Date Data

While I had a list of dates which needed re-processing, for this example I’ll use dates created using my number sequence method. And identify weekdays as the grouping.

--------------------------------------------------------------------------------
-- Create Date Table
--------------------------------------------------------------------------------
IF OBJECT_ID('tempdb..#date_t') IS NOT NULL
    DROP TABLE #date_t;

IF OBJECT_ID('tempdb..#ProcessingBatches') IS NOT NULL
    DROP TABLE #ProcessingBatches;

CREATE TABLE #date_t
(
  [TheDate]   DATE NOT NULL PRIMARY KEY,
  [IsWeekend] INT  NOT NULL,
  [IsWeekday] INT  NOT NULL,
);

--------------------------------------------------------------------------------
-- Create Date list
-- Using Number sequence: https://datablog.roman-halliday.com/index.php/2018/06/01/dates-in-sql-server-create-sample-date-ranges/
--------------------------------------------------------------------------------

-- Set the start of the week to 'Monday'
-- In this example, I'm batching things by 'working' week days
SET DATEFIRST 1;

DECLARE @DateStart DATE = '01-JAN-19';
DECLARE @DateEnd   DATE = EOMONTH('01-MAR-19');
PRINT 'Start Date: ' + CONVERT(VARCHAR(10), @DateStart, 120);
PRINT 'Start End: '  + CONVERT(VARCHAR(10), @DateEnd,   120);

WITH number_tbl AS (
SELECT 0 AS a UNION ALL
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5 UNION ALL
SELECT 6 UNION ALL
SELECT 7 UNION ALL
SELECT 8 UNION ALL
SELECT 9
), number_sequences AS (
SELECT t1.a AS t1a,
       t2.a AS t2a,
       t3.a AS t3a,
       (t3.a + (10 * t2.a) + (100 * t1.a)) AS concatenate_calculation
  FROM number_tbl t1
  CROSS JOIN number_tbl t2
  CROSS JOIN number_tbl t3
), date_sequence AS (
SELECT CAST(DateAdd(DAY, concatenate_calculation, @DateStart) AS DATE) AS [TheDate]
  FROM number_sequences
 WHERE CAST(DateAdd(DAY, concatenate_calculation, @DateStart) AS DATE) <= @DateEnd
)
INSERT
  INTO #date_t
      ([TheDate], [IsWeekend], [IsWeekday])
SELECT [TheDate],
       CASE WHEN DatePart(dw, [TheDate]) IN (6, 7) THEN 1 ELSE 0 END AS [IsWeekend],
       CASE WHEN DatePart(dw, [TheDate]) IN (6, 7) THEN 0 ELSE 1 END AS [IsWeekday]
  FROM date_sequence
;

SELECT * FROM #date_t;

GO

Put Processing Days Into Batches

Line 16 is where the magic happens. The difference in number of days between the row number and the given date. Consecutive dates will have the same difference. But if there is a jump in date, the difference will also jump.

----------------------------------------------------------------------------
-- Put unprocessed days into batches
----------------------------------------------------------------------------
WITH ds AS (
SELECT rcc.*,
       ROW_NUMBER() OVER (ORDER BY rcc.[TheDate]) AS [row_num]
  FROM #date_t rcc
 WHERE 1=1
   AND [IsWeekday] = 1 -- Whatever our filter criteria is for each batch
)
SELECT DateDiff(DAY, [row_num], [TheDate]) AS [DiffDays],
       MIN([TheDate])                      AS [batch_start_date],
       MAX([TheDate])                      AS [batch_end_date]
  INTO #ProcessingBatches
  FROM ds
 GROUP BY DateDiff(DAY, [row_num], [TheDate]) -- This is our calculation of if dates are consecutive
;

CREATE UNIQUE
 INDEX PK_ProcessingBatches
    ON #ProcessingBatches
      ([DiffDays])
;

SELECT * FROM #ProcessingBatches;

Create & Iterate Cursor

Declare a cursor and iterate over it in a loop, processing the dates in the batches as identified above.

--------------------------------------------------------------------------------
-- Craete Cursor
--------------------------------------------------------------------------------
DECLARE processing_date_cursor CURSOR FOR
SELECT [DiffDays], [batch_start_date], [batch_end_date]
  FROM #ProcessingBatches
;

--------------------------------------------------------------------------------
-- Iterate Cursor
--------------------------------------------------------------------------------
DECLARE @DiffDays  BIGINT;
DECLARE @DateStart DATE;
DECLARE @DateEnd   DATE;

OPEN processing_date_cursor;
FETCH NEXT FROM processing_date_cursor INTO @DiffDays, @DateStart, @DateEnd;
IF @@FETCH_STATUS <> 0
    PRINT 'Empty Cursor'
WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT CONVERT(VARCHAR, GetDate(), 120) + ' - Executing Range: ' + CONVERT(VARCHAR(20), @DateStart, 120) + ' --> ' + CONVERT(VARCHAR(20), @DateEnd, 120) ;
    ----------------------------------------------------------------------------
    -- Process Start
    ----------------------------------------------------------------------------
    PRINT 'START: ' + CONVERT(VARCHAR(10), @DateStart, 120);
    PRINT '  END: ' + CONVERT(VARCHAR(10),   @DateEnd, 120);

    ----------------------------------------------------------------------------
    -- It is always possible to chunk the date ranges up smaller with the
    -- techiniques shown here:
    -- 'Dynamic batches of dates'
    -- https://datablog.roman-halliday.com/index.php/2018/06/09/dates-in-sql-server-iteration-of-date-ranges/
    ----------------------------------------------------------------------------
    
    /*
    EXECUTE [DB].[Schema].[stored_procedure]  
        @StartDateVariable = @DateStart,
        @EndDateVariable   = @DateEnd
    ;
    */

    ----------------------------------------------------------------------------
    -- Process End
    ----------------------------------------------------------------------------
    FETCH NEXT FROM processing_date_cursor INTO @DiffDays, @DateStart, @DateEnd;
END

--------------------------------------------------------------------------------
-- Close Cursor
--------------------------------------------------------------------------------
CLOSE processing_date_cursor;
DEALLOCATE processing_date_cursor;

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.