To ensure that only the latest version of a record is used, they’ve then had to add one of these two columns to the primary key of the table. Ok, so this appears to be a no brainer. Because the EffectiveTo date won’t be set for the latest record version, and nullable columns can’t be added to the primary key, the obvious choice is to make the EffectiveFrom date part of the primary key.
But this creates a problem.
You see, if you’re nearly always using the latest version of the record, then your query will be something like:
SELECT * FROM MyTable WHERE KeyField1=Val1 AND KeyField2=Val2 AND …blah… AND EffectiveTo IS NULL
And this is not optimal, because while it will retrieve the record from the table, it needs to do a retrieval of all versions of the record before it can filter down to just the record that you want.
Ok, so lets see what we can do about it. Say we decide that in future, we won’t allow the EffectiveTo date to be nullable. Instead of null, we will set the EffectiveTo date to ‘2999-12-31’.
Now the column can be used in the primary key, and the record will be selected using a clustered index seek, provided, of course, all the previous primary key columns are specified in the query to. So the above query becomes:
SELECT * FROM MyTable WHERE KeyField1=Val1 AND KeyField2=Val2 AND …blah… AND EffectiveTo = ‘2999-12-31’
The problem here is that we are using magic numbers and unless there are exceptional circumstances, magic numbers are bad.
But how would we do it otherwise? Easy, add an IsActive bit field to the key before the EffectiveFrom field. That would would work.The EffectiveFrom field is still needed to ensure uniqueness in versioning (a bit field only has 0 or 1, remember?)
So an IsActive field is a much the better solution.
But wait, there’s more!
Because the IsActive column is now part of the primary key, you can use it to partition the table. You create a partitioning function to split the data, as follows:
create partition function myPartition(bit)
as range right for values(1)
Then you create a partioning scheme so that the data can be divided between filegroups. Here, I put current data on the Primary filegroup, and I’ve created another called OlderData for the archived but referencable data.
create partition scheme myScheme
as Partition myPartition
And then recreate your table to partition across the filegroups using this scheme.
CREATE TABLE MyTable
(KeyField1 varchar(10) not null,
KeyField2 varchar(10) not null,
EffectiveFrom datetime not null,
EffectiveTo datetime not null,
IsActive bit not null,
PRIMARY KEY (KeyField1, KeyField2, EffectiveFrom, IsActive ) )
ON myScheme (IsActive) ;
Consider the impact if you have EffectiveFrom and EffectiveTo columns on almost every table in your database. Now your queries should be much more efficient, as the current data is all on the primary partition/filegroup and the archive data is all on the OlderData partition/filegroup.
When you’re mainly using the latest data, your queries will spend the majority of their time in the Primary partition, which is a much smaller subset of the overall data
Test it out yourself – but don’t forget to click on “Include Actual Execution Plan” in the Query menu of Sql Server Management Studio so that you can see if it makes a difference to the queries you create.