Partitions in SQL Server: Partitioning a Table on a non Primary Key Column

Partitions in SQL Server: Partitioning a Table on a non Primary Key Column

In my series on partitioning, I’ve looked at creating a fact table partitioned by date. But I recently came across a case where I’d got data which had a date which should be used for the partition, however there was also a natural key in the data which should be maintained as a primary key.

Components of a Partitioned Table

Just to recap (and for those who have landed here after a google search). The key components which make up a partitioned table in SQL Server are:

  1. Partition Function: The definition of how data is to be split. It includes the data type, and the value ranges to use in each partition.
  2. Partition Scheme: The definition of how a partition function is to be applied to data files. This allows DBAs to split data across logical storage locations if required, however in most modern environments with large SANs most SQL Server implementations and their DBAs will just use ‘primary’.
  3. Table & Index Definition: The last part is the definition of the table, including the partition scheme to use, and the index with which to apply it.

Creating The Partitioned Table

For the benefit of those who have landed here by google, and those of us who like to copy/paste), here is the function and scheme sample first.

Partition Function

For this dataset, I’m going to add a new partition for each month (functions can be for any time frame you chose):

--------------------------------------------------------------------------------
-- Partition Function
-- Using RIGHT (so the value given is the LOWER value)
--------------------------------------------------------------------------------
CREATE PARTITION FUNCTION [FactTable_PF]
      (DATE) -- Datatype to use
AS
RANGE RIGHT  -- See previous post
 FOR VALUES  -- Values on which to 'split' the data
(
    N'2017-01-01T00:00:00', -- This will contain all data for 2017 Jan onwards (until the next partition). All data before then goes into a 'catch all' partition
    N'2018-01-01T00:00:00', -- This partition will contain all data from Jan 2018 onwards (until the next partition).
    N'2019-01-01T00:00:00'  -- This partition will contain all data from Feb 2018 onwards (unless a subsequent partition is added in the future)
);

Partition Scheme

Next, we create the scheme to tell SQL Server how to organise the data on the disk(s):

--------------------------------------------------------------------------------
-- Partition Scheme
--------------------------------------------------------------------------------
CREATE PARTITION SCHEME [FactTable_PS]
    AS PARTITION [FactTable_PF]
ALL TO ([primary])
;

Partitioned Table Definition

Like with partitioning an existing table, we use a separate CLUSTERED INDEX definition to apply the scheme and function. What is different here, is the application of a primary key which is explicitly UNCLUSTERED.

--------------------------------------------------------------------------------
-- Create Table using its own PK:
-- NONCLUSTERED is important, without it the data remains unpartitioned and only the index is partitioned
--------------------------------------------------------------------------------
CREATE TABLE [dbo].[FactTable_partitioned_different_pk]
(
    [surrogate_key] BIGINT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED,
    [Date]      DATE,
    [attribute] VARCHAR(100),
    [metric]    BIGINT
)
;

CREATE CLUSTERED
 INDEX DK_FactTable_partitioned_different_pk
    ON [dbo].[FactTable_partitioned_different_pk]
      ([Date])
    ON [FactTable_PS] ([Date])
;

 

 

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.