Dates In SQL Server: Public Holidays
For most business reporting, public holidays (or other special days) cary much business impact. Here are some ideas on solutions.
It usually doesn’t take long with a data warehouse project before we want to enhance our calendar. Holidays aren’t built into DBMS systems by default, so we have to create our own solution.
I feel like I have only scratched the surface here, and there is a lot more to cover at some point in the future.
Key design questions
Before we build anything, we have to think about these aspects which heavily impact our architectural decisions:
- What we count as a holiday? When banks and offices are closed. Or perhaps, the 750 greek names celebrated on various days throughout the orthodox calendar.
- Are we tracking different types of holiday? A combination of when businesses are closed such as Christmas, and when shopping patterns might spike like Valentines day.
- Do we need to take into account date ranges? Periods like Lent and Ramadan cover numerous consecutive days.
- How many regions are we including? Do we only need one region (country) of holidays, or do we need an international solution.
For the UK, the government publishes a handy list of dates which will do for most business reporting: https://www.gov.uk/bank-holidays
The solution(s)
The basic date table
Regardless of which holidays we store, I always recommend having a date/calendar table in a data warehouse.
The simple and most common case is dealing with holidays in one country, there will probably be a requirement to track all the days when banks/offices are closed for tracking banking/finance movements. We can also include dates which might impact the business (but which aren’t public holidays) such as valentines day, mothers day or sporting events.
The date information is stored in a basic table:
-------------------------------------------------------------------------------- -- Create Date Table -------------------------------------------------------------------------------- CREATE TABLE date_t ( [TheDate] DATE NOT NULL PRIMARY KEY, [IsWeekend] INT NOT NULL, [IsWeekday] INT NOT NULL, [IsHoliday] INT NOT NULL DEFAULT 0, [IsBankHoliday] INT NOT NULL DEFAULT 0, -- If we want to store non Public holidays [HolidayDescription] VARCHAR(100) NULL );
In the above example, I’ve included two columns to track holidays.
- IsHoliday – valentines, mothers day etc…
- IsBankHoliday – Christmas, Easter etc…
Creating and managing the dates
Working example is: SQL Server – UK Holiday Calendar Example
I cheated a little for this article and:
- Found a CSV file with UK bank holiday dates which I used as a base.
- Copied the CSV data into a spreadsheet tool.
- Used the spreadsheet to create the UPDATE statements.
- Pasted the created update statements into rextester.
The spreadsheet bits
Using LibreOffice Calc, I split text to columns to format the data.
I then filtered to bank holidays by adding the autofilter.
Replaced all single quotes with two (this escapes them in SQL Server).
I built the formula to create the statements as:
= "UPDATE date_t SET [IsHoliday] = 1, [HolidayDescription] = '" & A2 & "' WHERE [TheDate] = CONVERT(DATE, '" & TEXT(C2, "yyyy-mm-dd") & "', 120);"
Multiple holidays
If we want to track a case where multiple holidays/events can occur on any given day, then the simplest thing to do is split the events out into a separate table, but keep a date table.
Views can then communicate (simplifying dangers of duplicating rows) if it’s a public holiday.
-------------------------------------------------------------------------------- -- Create Date Table -------------------------------------------------------------------------------- CREATE TABLE date_t ( [TheDate] DATE NOT NULL PRIMARY KEY, [IsWeekend] INT NOT NULL, [IsWeekday] INT NOT NULL ); -------------------------------------------------------------------------------- -- Create Holiday Table -------------------------------------------------------------------------------- CREATE TABLE holiday_t ( [HolidayID] BIGINT NOT NULL IDENTITY(1,1) PRIMARY KEY, [HolidayDate] DATE NOT NULL, [HolidayIsBankHoliday] INT NOT NULL, [HolidayDescription] VARCHAR(100) NULL, FOREIGN KEY ([HolidayDate]) REFERENCES date_t([TheDate]) );
See example on rextester: SQL Server – Multiple holidays Model
The international date table
Extending our model to work internationally is the same as extending to handle multiple holidays. The simple solution is to just add an extra column to denote country/region.
-------------------------------------------------------------------------------- -- Create Holiday Table (with Country) -------------------------------------------------------------------------------- CREATE TABLE holiday_t ( [HolidayID] BIGINT NOT NULL IDENTITY(1,1) PRIMARY KEY, [HolidayCountryCode] VARCHAR(3) NOT NULL, [HolidayDate] DATE NOT NULL, [HolidayIsBankHoliday] INT NOT NULL, [HolidayDescription] VARCHAR(100) NULL, FOREIGN KEY ([HolidayDate]) REFERENCES date_t([TheDate]) );
Database theory and the rules of normalisation might lead one to consider having an extra table of countries, and an extra holidays in countries table to link the two, this way we can apply Christmas day across most of the western world, and a period of ramadan across much of the east. I’d personally recommend against such an implementation, very few organisations will require this level of calendar interaction and supporting such a model will become a full time job in itself.