Thursday, April 13, 2006

Internal storage of the DateTime datatype

There are many different opinions on how the DateTime value is stored internally in SQL Server. Some people say that it is converted to UTC, some think that some other magic stuff happens but none of this is true.

When you look at the books online this is the explanation:

"Values with the datetime data type are stored internally by the Microsoft
SQL Server 2005 Database Engine as two 4-byte integers. The first 4 bytes store
the number of days before or after the base date: January 1, 1900. The base date
is the system reference date. The other 4 bytes store the time of day
represented as the number of milliseconds after midnight."

It may be hard to believe but this is exactly what SQL Server does. You send him a date and time and he calculates the number of days since 1900-01-01 and the number of ticks since 00:00 for that date. SQL Server does NOT care about timezone and daylight savings time. If you really need to take these into account I suggest you save the offset between the UTC date and the local date and the UTC date. I give preference to take both the times from the client so you don't create problems because of time differences between the server and the client (unless you can guarantee in-sync clients).

Here is a little script that demonstrates the internal storage:

DECLARE @theDate datetime
SET @theDate = '2006-04-14 14:00'

SELECT CAST(SUBSTRING(CAST(@theDate AS varbinary), 1, 4) AS int)

SELECT CAST(SUBSTRING(CAST(@theDate AS varbinary), 5, 4) AS int)
SELECT CONVERT(char(10),DATEADD(d, 38819, '1900-01-01'), 120) AS 'theDate'
SELECT CONVERT(char(10), DATEADD(s, (15120000 / 300), '00:00'), 108) AS 'theTime'

2 comments:

Gabriel Lozano-MorĂ¡n said...

Just a question on a side note, I am trying to find some performance comparison tests for datetime vs smalldatetime datat types, have you tried it once? I have seen a lot of DBA's create datetime fields where no smaller precision than to the minute is needed so I am wondering what the exact overhead is not only on paging level but also cpu overhead, memory allocations ...

WesleyB said...

Gabriel,

I don't think there will be a big difference in this but I also like to prove my statements. I'll conduct some tests asap. I'm doggy-sitting @ my brothers house while he is enjoying the Spanish weather :( As soon as I can reach a SQL Server I'll let you know ;-)

A datetime is 8 bytes as opposed to 4 bytes for a smalldatetime so I would definitely go for smalldatetime when it meets the precision requirements you need. I always go for the narrowest possible datatype because obviously having more rows on a page is great in terms of performance and memory usage. My guess is that the performance difference between 4-bytes and 8-bytes won't be noticeable from a CPU/memory point of view except for the fact that you might fit less rows on a page increasing the number of pages to read and buffer.