Dates In SQL Server: Creating Date Range Batches
I thought I’d covered everything I could cover on dates in SQL Server in my previous series. But a new challenge raised something which wasn’t already in there.
I found a scenario where I wanted to identify groups of consecutive dates over a larger date range. There was a process which took a start date and an end date and would process all data in that range. I wanted to go over a lot of history, but only reprocess the dates which had been impacted. I could identify the dates which needed reprocessing, many of which were sequential. But there were also numerous days and groups of days which didn’t need to be re-processed.
The solution was to identify the consecutive date ranges and put them into a cursor, with a start date and end date for each row. Solitary dates would be represented with one record having the same start and end date.
In the below worked solution, I’ve spread things out a bit to make reading and understanding a little easier. I believe easy to read and debug code is often better than highly compact code.
Sample Date Data
While I had a list of dates which needed re-processing, for this example I’ll use dates created using my number sequence method. And identify weekdays as the grouping.
-------------------------------------------------------------------------------- -- Create Date Table -------------------------------------------------------------------------------- IF OBJECT_ID('tempdb..#date_t') IS NOT NULL DROP TABLE #date_t; IF OBJECT_ID('tempdb..#ProcessingBatches') IS NOT NULL DROP TABLE #ProcessingBatches; CREATE TABLE #date_t ( [TheDate] DATE NOT NULL PRIMARY KEY, [IsWeekend] INT NOT NULL, [IsWeekday] INT NOT NULL, ); -------------------------------------------------------------------------------- -- Create Date list -- Using Number sequence: https://datablog.roman-halliday.com/index.php/2018/06/01/dates-in-sql-server-create-sample-date-ranges/ -------------------------------------------------------------------------------- -- Set the start of the week to 'Monday' -- In this example, I'm batching things by 'working' week days SET DATEFIRST 1; DECLARE @DateStart DATE = '01-JAN-19'; DECLARE @DateEnd DATE = EOMONTH('01-MAR-19'); PRINT 'Start Date: ' + CONVERT(VARCHAR(10), @DateStart, 120); PRINT 'Start End: ' + CONVERT(VARCHAR(10), @DateEnd, 120); WITH number_tbl AS ( SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9 ), number_sequences AS ( SELECT t1.a AS t1a, t2.a AS t2a, t3.a AS t3a, (t3.a + (10 * t2.a) + (100 * t1.a)) AS concatenate_calculation FROM number_tbl t1 CROSS JOIN number_tbl t2 CROSS JOIN number_tbl t3 ), date_sequence AS ( SELECT CAST(DateAdd(DAY, concatenate_calculation, @DateStart) AS DATE) AS [TheDate] FROM number_sequences WHERE CAST(DateAdd(DAY, concatenate_calculation, @DateStart) AS DATE) <= @DateEnd ) INSERT INTO #date_t ([TheDate], [IsWeekend], [IsWeekday]) SELECT [TheDate], CASE WHEN DatePart(dw, [TheDate]) IN (6, 7) THEN 1 ELSE 0 END AS [IsWeekend], CASE WHEN DatePart(dw, [TheDate]) IN (6, 7) THEN 0 ELSE 1 END AS [IsWeekday] FROM date_sequence ; SELECT * FROM #date_t; GO
Put Processing Days Into Batches
Line 16 is where the magic happens. The difference in number of days between the row number and the given date. Consecutive dates will have the same difference. But if there is a jump in date, the difference will also jump.
---------------------------------------------------------------------------- -- Put unprocessed days into batches ---------------------------------------------------------------------------- WITH ds AS ( SELECT rcc.*, ROW_NUMBER() OVER (ORDER BY rcc.[TheDate]) AS [row_num] FROM #date_t rcc WHERE 1=1 AND [IsWeekday] = 1 -- Whatever our filter criteria is for each batch ) SELECT DateDiff(DAY, [row_num], [TheDate]) AS [DiffDays], MIN([TheDate]) AS [batch_start_date], MAX([TheDate]) AS [batch_end_date] INTO #ProcessingBatches FROM ds GROUP BY DateDiff(DAY, [row_num], [TheDate]) -- This is our calculation of if dates are consecutive ; CREATE UNIQUE INDEX PK_ProcessingBatches ON #ProcessingBatches ([DiffDays]) ; SELECT * FROM #ProcessingBatches;
Create & Iterate Cursor
Declare a cursor and iterate over it in a loop, processing the dates in the batches as identified above.
-------------------------------------------------------------------------------- -- Craete Cursor -------------------------------------------------------------------------------- DECLARE processing_date_cursor CURSOR FOR SELECT [DiffDays], [batch_start_date], [batch_end_date] FROM #ProcessingBatches ; -------------------------------------------------------------------------------- -- Iterate Cursor -------------------------------------------------------------------------------- DECLARE @DiffDays BIGINT; DECLARE @DateStart DATE; DECLARE @DateEnd DATE; OPEN processing_date_cursor; FETCH NEXT FROM processing_date_cursor INTO @DiffDays, @DateStart, @DateEnd; IF @@FETCH_STATUS <> 0 PRINT 'Empty Cursor' WHILE @@FETCH_STATUS = 0 BEGIN PRINT CONVERT(VARCHAR, GetDate(), 120) + ' - Executing Range: ' + CONVERT(VARCHAR(20), @DateStart, 120) + ' --> ' + CONVERT(VARCHAR(20), @DateEnd, 120) ; ---------------------------------------------------------------------------- -- Process Start ---------------------------------------------------------------------------- PRINT 'START: ' + CONVERT(VARCHAR(10), @DateStart, 120); PRINT ' END: ' + CONVERT(VARCHAR(10), @DateEnd, 120); ---------------------------------------------------------------------------- -- It is always possible to chunk the date ranges up smaller with the -- techiniques shown here: -- 'Dynamic batches of dates' -- https://datablog.roman-halliday.com/index.php/2018/06/09/dates-in-sql-server-iteration-of-date-ranges/ ---------------------------------------------------------------------------- /* EXECUTE [DB].[Schema].[stored_procedure] @StartDateVariable = @DateStart, @EndDateVariable = @DateEnd ; */ ---------------------------------------------------------------------------- -- Process End ---------------------------------------------------------------------------- FETCH NEXT FROM processing_date_cursor INTO @DiffDays, @DateStart, @DateEnd; END -------------------------------------------------------------------------------- -- Close Cursor -------------------------------------------------------------------------------- CLOSE processing_date_cursor; DEALLOCATE processing_date_cursor;