Wednesday, August 09, 2006

Rebuilding an index and full scan of statistics

I recently saw a post on a forum discussing update stats and when to trigger it. Well first of all SQL Server triggers the update if auto stats is enabled when approximately 20% of the rows have changed. There is a difference between SQL Server 2000 and SQL Server 2005 however.

SQL Server 2000 uses a counter to track row modification whereas SQL Server 2005 uses a counter that tracks changes on column level. The biggest difference here is that updates are rated different when updating a key column. A non-key column update just raises the counter with the number of updated columns whereas a key column update raises the counter with 2 for each column. Another difference is that TRUNCATE TABLE and BULK INSERT does not raise the counters in SQL Server 2000 but they are accounted for in SQL Server 2005.

Obviously sometimes you may want to disable auto stats because they can hinder your performance in an OLTP environment and some of your tables might have a good representative set of data that statistically speaking will not change too much during the day. SQL Server 2005 gives you the option to update statistics async so the query triggering the update will not be blocked while updating the statistics but nevertheless you might not even want this to happen. In that case my advice is to update your statistics before your rebuild your indexes, unless you specify a full scan anyway (tables that are less than 8MB are always fully scanned for stats updates) but on big databases it might not be a viable option.

Why do I say before you reindex? Because a reindex automatically triggers an update stats with a full scan for THAT index since it has to read all the data for it anyway (pretty intelligent no?). When you trigger an update stats with the default sample set those stats are overwritten again with 'less' accurate stats. But yet another difference in SQL Server 2005, when you use sp_updatestats only statistics that require an update are actually updated.

You can easily check this behavior by using DBCC SHOW_STATISTICS after you rebuild an index.