Tuesday, October 30, 2007

SQL Server 2005 SP2 Cumulative Hotfix Package 5 - Announced

We are in the process of gradually upgrading our servers to CU4 now (build 3200) and here is the next announcement.

I know many people think Microsoft is messing about with SQL Server 2005 and that it has more bugs than other products, but this is not the case.
Releasing this many hotfixes is all part of a new strategy of the "incremental servicing model" which is explained here.

So do not panic, updates like this will be released every 2 months.

Saturday, October 27, 2007

PowerShell

Many UNIX people will probably say "boring" when they see PowerShell. UNIX has always supported great flexibility in scripting; of course, it never had a GUI as polished as Windows. But server management is getting more and more complex because of large environments and the many flavors of server software, so it was time to start thinking beyond the GUI: a GUI tends to be slower than a script-enabled server, and some things are simply impossible using only a GUI. With scripts you can check for certain conditions and automatically take the appropriate actions based on these conditions.

The answer to this is PowerShell, a command-line shell and scripting environment that enables you to manage almost everything. One of its most powerful features is that you can extend it with your own .NET components as well as use the out-of-the-box .NET Framework objects.

For Exchange 2007 the PowerShell layer was written first and the GUI is based entirely on these scripts. The GUI will also support a "Script This Action" option (sounds like we were ahead in SQL Server) which creates the PowerShell script for you.

At the customer we have something we call DCS (Data Conversion Scripts); basically it means going from one version of the database to the next. We had a great .NET console application combined with some batch files that did this for us. Although this worked well, we felt it was getting too complex to manage because we do parallel development and support different versions at the same time. A new system was required using a simpler concept, which my colleague killspid had proven was possible.

We started out with a batch file that used SQL scripts with SQLCMD to perform the DCS task. Although this worked fine, the lead DBA had a new requirement that needed more logic and was hard to implement in a batch file or T-SQL. We decided to rewrite the DCS in PowerShell and I am very glad we did. Using a PowerShell script, SQLCMD and an XML config file we now support all the requirements and have much more control than we did in the previous versions. The support for scripting variables in SQLCMD has also proven very useful, so do not underestimate the power of SQLCMD either.

I must admit the syntax and way of working is a bit peculiar at first because you have to lose the .NET mindset and get into the scripting mindset. If you are used to creating administrative scripts you will probably have less of a problem.

Tip: if you want to suppress the output SQLCMD gives you in PowerShell, redirect it to $null.
eg. sqlcmd -SmyServer -dmyDatabase -E -b > $null
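To illustrate those SQLCMD scripting variables: a minimal sketch of how a DCS-style script can be parameterized (the file, table and server names here are made up for the example):

```sql
-- upgrade.sql: run with sqlcmd, not from Management Studio
-- $(TargetDB) and $(NewVersion) are SQLCMD scripting variables supplied on the command line
USE $(TargetDB);
GO
UPDATE dbo.SchemaVersion SET VersionNumber = '$(NewVersion)';
GO
```

Called from PowerShell this would look something like: sqlcmd -SmyServer -E -b -i upgrade.sql -v TargetDB="myDatabase" NewVersion="2.1" > $null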

Tuesday, October 23, 2007

Table Valued Parameters

Before SQL Server 2008 the most common way to pass data from one procedure to another was to create a temporary table, fill it with the required data and then call another procedure that uses that temporary table. Although this is not a disaster it does have some flaws, the most important one being that it is not strongly typed. The weak typing of temp tables in this scenario makes it a perfect spot for issues. There are other solutions using XML, but these tend to be more complex than necessary for the 'simple' thing you are trying to do... pass structured data around.

SQL Server 2008 now supports something called Table Valued Parameters (or TVPs) which can help you in these situations. TVPs make it possible to use a "table" as a parameter for a procedure. A couple of limitations apply: a TVP must be declared READONLY in the procedure that defines it as a parameter, and it can only be used as an input parameter. Apart from this, the same rules apply to TVPs as to table variables; for example, no DDL can be executed against a TVP and no statistics are kept for TVPs.

A little example will make it very clear.

--Create test table
CREATE TABLE myUsers
(ID int, UserName varchar(50), UserRole tinyint);
GO

--Create the required type
CREATE TYPE UserRoleType AS TABLE
( UserRole tinyint );
GO

--Create procedure that takes the type as a parameter (READONLY is required)
CREATE PROCEDURE GetUsersInRole
@UserRoleType UserRoleType READONLY
AS
SELECT UserName
FROM myUsers u
INNER JOIN @UserRoleType ut ON u.UserRole = ut.UserRole
GO

--Insert some test data (multiple inserts in one go, another new feature)
INSERT INTO myUsers
VALUES (1, 'Wesley', 1),
(2, 'Tom', 2),
(3, 'Patrick', 2),
(4, 'Jan', 3),
(5, 'Bregt', 3)

--Throw in a new dmv to look at the type and check dependencies if you like
--SELECT * FROM sys.table_types
--SELECT * FROM sys.dm_sql_referenced_entities ('dbo.GetUsersInRole', 'OBJECT')
GO

--Lookup action
DECLARE @UserRoleType AS UserRoleType;

--Let's use another new feature (initialize variables on declare!)
DECLARE @Admin tinyint = 1
DECLARE @PowerUser tinyint = 2
DECLARE @User tinyint = 3

--Add parameter values to the table valued parameter
--INSERT INTO @UserRoleType VALUES (1), (2)
INSERT INTO @UserRoleType VALUES (@Admin), (@PowerUser)

--Call stored procedure with specific type (remember the post is about table valued parameters)
EXEC GetUsersInRole @UserRoleType;
GO

--Clean up
DROP PROCEDURE GetUsersInRole
DROP TYPE UserRoleType
DROP TABLE myUsers

Tuesday, October 16, 2007

SQL Server 2005 SP2 Cumulative Hotfix Package 4 - Released

It is out there! We are at the nice round build 3200.
As with CU3, there is no download available.

http://support.microsoft.com/default.aspx/kb/941450

Tuesday, October 02, 2007

Some I/O improvements in Windows Server 2008

Windows Server 2008 offers a lot of very visible new features like Internet Information Services 7.0, a whole new virtualization model, Server Manager and many, many more. But being a SQL Server person I am interested in the things that have been improved under the hood. These less visible improvements are mostly I/O related, since databases are synonymous with I/O. Also note that many (if not all) of these changes also apply to Windows Vista.

I/O Completion Port improvement

Before Windows Server 2008, a thread that issued an async I/O also executed the I/O completion task, causing an expensive context switch. The I/O completion is now deferred until the thread pulls the I/O off the completion port, preventing this context switch.

I/O prioritization

This is completely new to the OS: not only do your processes have a priority, the I/Os triggered by these processes now have a priority of their own. This priority is based on the thread priority but can also be set on the I/O itself.

I/O Performance improvements

The new GetQueuedCompletionStatusEx API call makes it possible to retrieve multiple completion port entries in a single call.

Prior to Windows Server 2008 the Memory Manager and I/O system limited every I/O request internally to 64KB; larger requests were divided into multiple 64KB parts. This limit has been removed so every request can now be issued as a whole, meaning fewer transitions to kernel mode to send the I/O to your storage device.

NUMA improvements

When working with SQL Server 2005 and later you will most likely encounter more and more NUMA enabled machines. In Windows Server 2008 more memory allocation procedures have been updated to be NUMA aware, and I/O interrupts direct their completion to the node that initiated the I/O. An addition to the NUMA APIs also allows applications to specify the preferred node.

SMB2

According to the people who designed SMB2, it should improve the performance of large file copies over the network by a factor of 30 to 40. This will not really help your SQL Server performance but it will certainly get your backups to other servers faster.

For more information about kernel changes in Windows Server 2008 check out this great webcast by Mark Russinovich.

Sunday, September 30, 2007

Majority Quorum Model

Windows Clustering Services have been around for some time and most if not all critical servers in large enterprises are probably running on this technology. An important part of a cluster is the quorum disk, which holds vital information to keep your cluster up and running. The two options in Windows 2003 for this quorum are the shared disk quorum model, where all nodes share the same quorum disk, and the majority node set model, where each node has a replicated copy of the quorum. The shared disk model is used most often because a lot of clusters consist of only two nodes. Do note that there is an update for Windows Server 2003 that enables a "File Share Witness" to create a majority node set cluster with just 2 nodes (KB921181 - steps required).

In Windows Server 2008 these two models have been merged into what is now called the Majority Quorum Model. Before Windows Server 2008 the quorum disk was a single point of failure. The risk was obviously low because quorum drives are usually installed on redundant storage devices, but even highly redundant storage may fail at some point. In Windows Server 2008, however, the quorum disk - or witness disk as it is now called - is no longer a single point of failure. Clustering now uses a 'vote' system where each node and the witness disk can be assigned a vote. A cluster remains online as long as a majority of the votes is available, meaning your cluster will continue to work even when the witness disk is lost.

More information about failover clustering in Windows Server 2008 can be found here:
http://www.microsoft.com/windowsserver2008/failover-clusters.mspx

Saturday, September 29, 2007

Windows Server 2008 RC0

You are probably all aware of this, but since I am on holiday I have a good excuse for being a bit late.

Microsoft has released Windows Server 2008 Release Candidate 0. This is great news since there are a lot of exciting features in Windows Server 2008 for you to play with. They have also included the long-awaited Hypervisor platform for your virtualization needs.

Get this free download here.

Thursday, September 13, 2007

SQL Server 2005: Migration aftermath

I guess we can safely say that the migration was a success. It was quite a flat migration since we needed to guarantee a "backout" to SQL Server 2000. The only changes were the replacement of the SCSIPort driver with the StorPort driver (this was the only server left to migrate) and a change to the physical implementation of the database.

A couple of preliminary results since replication is still running and it obviously has impact on performance:

Long running transactions:
Before: 70.000/day > 100 msec - 1.000/day > 1000 msec
After: 14.000/day > 100 msec - 650/day > 1000 msec

Long running CUD transactions:
Before: 40.000/day > 100 msec - 250/day > 1000 msec
After: 4.000/day > 100 msec - 100/day > 1000 msec

Avg Disk Secs/Read:
Before: 12 msec
After: 9 msec

Avg Disk Secs/Write:
Before: 15 msec
After: 9 msec

Job duration dropped by over 50%.

Now it is time to start implementing new features like partitioning, service broker, vardecimal and hopefully many more. We have a lot of ideas floating around in our heads but unfortunately there are a couple of other priorities to take care of first. Since we were no longer happy with our existing reindexing job we did decide to rewrite it from scratch, and this gave us the opportunity to implement online reindexing.

I would like to thank all of my colleagues and the people from Microsoft who made this migration possible. I also apologize to all the colleagues I had to send away from my desk during the migration preparation ;-)

Friday, September 07, 2007

SQL Server 2005: Migration imminent and something about TF4618

So we are back at the point of no return. This Sunday we will be migrating our largest SQL Server 2000 instance to SQL Server 2005. During the first week we will use transactional replication to ensure a 'backout' in case of failure. Last year when we attempted to migrate this server we unfortunately hit the TokenAndPermUserStore issue and had to roll back the migration.

As explained in this post, we created some test scripts that were able to degrade performance significantly. There were a couple of hotfixes for issues concerning the TokenAndPermUserStore cache, the latest being build 3179. We eventually did our tests on build 3186, which is the version we will be going to production with; unfortunately we were still able to degrade performance with our scripts.

We looked at a couple of alternatives to prevent this decrease from happening and as far as we know there are 3 ways to prevent the TokenAndPermUserStore cache from growing too large and these are:
  • add the user(s) that access the database to the sysadmin role
  • create a scheduled job that issues DBCC FREESYSTEMCACHE ('TokenAndPermUserStore')
  • use TF4618
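A sketch of what the second option could look like as a scheduled job step; the DMV query is just there to log the cache size before the flush:

```sql
-- Log the current size of the security token cache, then flush it.
-- Meant to run as a SQL Agent job step on SQL Server 2005 SP2.
SELECT single_pages_kb + multi_pages_kb AS cache_kb
FROM sys.dm_os_memory_cache_counters
WHERE type = 'USERSTORE_TOKENPERM';

DBCC FREESYSTEMCACHE ('TokenAndPermUserStore');
```

How often you schedule this depends entirely on how fast the cache grows in your workload.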

After we ran our tests with trace flag 4618 we found that the TokenAndPermUserStore cache remained very small (less than 1MB) and our response times were very stable. The catch with this trace flag is that it has to be enabled by adding -T4618 to the SQL Server startup parameters (using DBCC TRACEON does not help in this case). What this trace flag does is evict old entries when new entries are inserted, to keep the cache small. Because this extra work might cause an increase in CPU usage, be sure to monitor it carefully. We did not see any significant increase in our stress tests.

We also decided to take one more "risk": a reorganization of our database file layout. The legacy layout, a leftover from the previous SAN, was 4 files for the data and 4 files for the non-clustered indexes. Since we noticed a great difference in the response times of the LUNs between the 2 filegroups, we now use 1 filegroup spread over the 8 LUNs. This resulted in a great balance over all LUNs and improved response times.

After many hours of reorganizing, stress testing, replication testing and going mad we are finally there. We had days of paranoia and days of euphoria and we ended with euphoria, hopefully it will remain this way. Monday will tell us if everything was as well as we suspected so I suppose the only thing that remains is to wish us luck :-)

Monday, August 20, 2007

SQL Server 2005 SP2 Cumulative Hotfix Package 4

Probably a bit early since the previous package is not yet available, but knowledge is power they say.

http://support.microsoft.com/kb/941450

Wednesday, August 08, 2007

SQL Server 2005 SP2 Cumulative Hotfix Package 3

Keep an eye out for KB939537 which will be the next cumulative hotfix package for SQL Server 2005 SP2. Unfortunately the fix itself is not yet available for download.

Any guesses on the build number? I'm going for 3193 :-)

*EDIT*
It is build 3186 so I was pretty close but too pessimistic ;-)

Wednesday, August 01, 2007

Some new features in SQL Server 2008 - July CTP

Date and time

As many of you already know we finally have separate date and time datatypes in SQL Server 2008. These are not the only new options for date and time however; there are a couple of other new datatypes like datetime2 and datetimeoffset.

  • Date
    The date datatype takes 3 bytes and has a range from 0001-01-01 through 9999-12-31
  • Time
    The time datatype takes 3 to 5 bytes depending on the fractional second precision. Time has a range from 00:00:00.0000000 through 23:59:59.9999999. The default precision is 7, meaning hh:mm:ss.nnnnnnn, and depending on the precision you specify - with time(x) where x is a number from 0 to 7 - the storage size differs. From 0 to 2 time takes 3 bytes, from 3 to 4 it takes 4 bytes and up to 7 it takes 5 bytes.
  • Datetime2
    Although this makes me shiver as it makes me think of varchar2 it is a great datatype. Datetime2 takes from 6 to 8 bytes and has a range of 0001-01-01 through 9999-12-31. It also implements the same fractional second precision option as the time datatype. From 0 to 2 datetime2 will take 6 bytes, from 3 to 4 it will take 7 bytes and up to 7 it will take 8 bytes.
  • Datetimeoffset
    The datetimeoffset datatype takes from 8 to 10 bytes and has the same properties as datetime2 but the difference with the other datatypes is that datetimeoffset is timezone aware. The fractional second precision again defines the storage size, from 0 to 2 datetimeoffset will take 8 bytes, from 3 to 4 it will take 9 bytes and up to 7 it will take 10 bytes. There are also a couple of datetimeoffset related functions like SWITCHOFFSET and TODATETIMEOFFSET.
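A quick way to check the storage sizes for yourself is DATALENGTH; a small sketch (variable names are arbitrary, syntax as in the July CTP):

```sql
--Declare one of each new datatype at a chosen precision
DECLARE @d date = '2007-08-01';
DECLARE @t time(3) = '23:59:59.999';
DECLARE @dt2 datetime2(7) = '2007-08-01 23:59:59.9999999';
DECLARE @dto datetimeoffset(0) = '2007-08-01 23:59:59 +02:00';

--DATALENGTH returns the storage size in bytes for each precision
SELECT DATALENGTH(@d)   AS date_bytes,       -- 3
       DATALENGTH(@t)   AS time3_bytes,      -- 4
       DATALENGTH(@dt2) AS datetime2_bytes,  -- 8
       DATALENGTH(@dto) AS dto0_bytes;       -- 8
```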

HierarchyID

Another new datatype is the hierarchyid datatype, which represents a position in a tree hierarchy. I did not have the time to play with it yet, but from what I have read it implements 2 special indexing methods, depth-first and breadth-first. It also has a couple of related functions like GetAncestor, GetDescendant, IsDescendant, ...
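I have not played with it myself yet, so consider this a sketch based on the documented methods rather than tested code:

```sql
--A tiny tree using hierarchyid (method names as documented for SQL Server 2008)
DECLARE @root hierarchyid, @child hierarchyid;
SET @root = hierarchyid::GetRoot();
SET @child = @root.GetDescendant(NULL, NULL);   --first child under the root

SELECT @child.ToString()                 AS child_path,    -- '/1/'
       @child.GetAncestor(1).ToString()  AS parent_path,   -- '/'
       @child.IsDescendantOf(@root)      AS is_descendant; -- 1
```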

Object Dependencies

The new management views sys.sql_expression_dependencies, sys.dm_sql_referencing_entities, and sys.dm_sql_referenced_entities give you a view of dependencies that is always up-to-date. They even allow you to see cross-server and cross-database dependencies.
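For example, assuming a procedure dbo.GetUsersInRole and a table dbo.myUsers exist:

```sql
--Everything a given procedure references (tables, columns, ...)
SELECT referenced_entity_name, referenced_minor_name
FROM sys.dm_sql_referenced_entities('dbo.GetUsersInRole', 'OBJECT');

--And the reverse: everything that references a given table
SELECT referencing_entity_name
FROM sys.dm_sql_referencing_entities('dbo.myUsers', 'OBJECT');
```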

Page corruption

The new management view sys.dm_db_mirroring_auto_page_repair contains information about automatic repair attempts on mirrored databases (page restores from the mirror database).

SQL Server Extended Events

Built in support for the ETW (Event Tracing for Windows) engine to help you trace, log and debug problems. For more information about using ETW with SQL Server check out this post.

I recommend you to play with SQL Server 2008 yourself since there is already a very long list of new things to learn in this CTP! There are still a couple of other features in the July CTP which I have not seen yet so I hope I will find some time to experiment with these too.

TokenAndPermUserStore continued

Dirk G. from MCS Belgium informed me about another hotfix for the TokenAndPermUserStore issue. Build 3179 contains another fix for this cache (the latest build is 3182 by the way). The official description is "The TokenAndPermUserStore cache store may continue to grow steadily and decrease performance".

Check out the hotfixes here. Unfortunately they are on-demand so you will have to contact Microsoft to obtain this build.

Tuesday, July 31, 2007

SQL Server 2008 - July CTP

It's play time :-)
A lot of people will be very happy... date and time are available as separate datatypes!

More information here and download here.

Wednesday, July 25, 2007

MemToLeave Issue in SQL Server 2000 x86

This week we were getting "WARNING: Failed to reserve contiguous memory of Size= 131072." errors on a couple of our SQL Server 2000 machines. Since we had a lot of available memory we quickly suspected the cause to be VAS fragmentation, but we needed to find the root cause and confirmation that it was indeed VAS fragmentation. This is a bit harder since SQL Server 2000 has a lot fewer options to investigate this than SQL Server 2005.

VAS (Virtual Address Space) is the 4 GB region (in 32-bit) where 2 GB is reserved for the kernel mode and the other 2 GB for user mode (unless you specify /3GB). Every memory allocation (including AWE mapping) for an application happens in the user mode portion of the VAS.

By default SQL Server reserves 256MB for the MemToLeave region plus an additional part calculated as worker threads * stack size (0.5MB for SQL Server 2000 x86). This memory is reserved in the VAS and used for special memory allocations like extended stored procedures, sp_OA procedures, CLR integration and some specific memory managers. One of the properties of MemToLeave memory is that it is static; it can only be increased at startup by using the -g switch. If you wish to do this, do it with great caution, as VAS is only 2 GB or 3 GB on a 32-bit system. As a side note, allocations from this region are done by the Multi Page Allocator in SQL Server 2005, meaning the memory is not stolen from the buffer pool and consequently is not included in the Max Server Memory setting.
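To make that calculation concrete, assuming the SQL Server 2000 default of 255 worker threads (check your 'max worker threads' setting, this number is an assumption):

```sql
--Rough MemToLeave calculation for SQL Server 2000 x86 (defaults assumed)
DECLARE @workers int, @stack_mb numeric(9,1), @memtoleave_mb numeric(9,1)
SET @workers = 255          --default 'max worker threads' on SQL Server 2000
SET @stack_mb = 0.5         --thread stack size on x86
SET @memtoleave_mb = 256 + (@workers * @stack_mb)
SELECT @memtoleave_mb AS approx_memtoleave_mb   -- 383.5
```

So out of a 2 GB user-mode VAS, close to 400MB is reserved before the buffer pool even starts.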

We contacted PSS to help us find the root cause and they instructed us to start SQL Server with -T3654, which enables a command that returns information about the memory allocations SQL Server has made: DBCC MEMOBJLIST. Restarting SQL Server would have resolved the whole issue but would have limited our chances of finding the root cause. Since it was not a production server we decided to postpone the restart a little and continue looking. It was a good decision, since one of my colleagues noticed that a lot of trace files had been created at the same time. We have a central mechanism that uses server-side traces to profile all our servers, and something had gone seriously wrong that weekend. When we issued a select * from ::fn_trace_getinfo(0) on the server it showed us that 270 traces were running. Probably because of a little bug, the tool had started an incredible number of traces on the server.

When we issued a DBCC MEMORYSTATUS we got the following results:

Dynamic Memory Manager Buffers

Stolen 3728
OS Reserved 37528
OS Committed 37506
OS In Use 37502
General 1513
QueryPlan 2420
Optimizer 0
Utilities 36900
Connection 220


According to KB271624, 'Utilities' is used by various utility routines like BCP, the Log Manager, parallel queries, ::fn_trace_gettable and others - "others" including the buffers used for tracing. Starting the 200+ traces caused our VAS to become heavily fragmented, leaving us with a lot of different errors like "There is not enough storage to complete this operation", "Failed to reserve contiguous memory", login failures, etc. As you can see the 'Utilities' memory manager holds 36900 buffers, which is a massive amount considering the limited MemToLeave size.
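For completeness, runaway server-side traces like these can be stopped and closed with sp_trace_setstatus; a sketch (trace id 270 is just an example):

```sql
--Stop and close one runaway server-side trace
DECLARE @traceid int
SET @traceid = 270
EXEC sp_trace_setstatus @traceid, 0   --0 = stop the trace
EXEC sp_trace_setstatus @traceid, 2   --2 = close the trace and delete its definition
```

Closing the trace releases the trace buffers, but the VAS fragmentation they caused only goes away with a restart.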

So remember, even if you have lots of available memory you can still run out of memory!

Tuesday, July 24, 2007

Visual Studio Team Edition for Database Professionals Service Release 1

Just a quick note for those of you who are lucky enough to work with VSDBPRO.
Service release 1 is available, more info here and KB936612.

Monday, July 23, 2007

Run batch multiple times

Kalen posted an awesome tip that is so simple and yet so cool I just had to refer to it.

Thursday, July 19, 2007

Even more USERSTORE_TOKENPERM

Reading this and this post reminded me of our very bad experience with the USERSTORE_TOKENPERM cache.

When we ran into our USERSTORE_TOKENPERM issue last year there was no real awareness of the problem yet (SQL Server 2005 x64 - Enterprise Edition Build 2153). It was so bad we had to revert to SQL Server 2000 because things got so slow that SQL Server started to become unresponsive. After we got everything back to the original server with SQL Server 2000 we had a couple of weeks of detective work in front of us. Having an issue with this much impact needs to be investigated and the root cause must be found before you can even think about migrating again. We had done all the steps needed before migrating like stress testing, functional jobs, system jobs, ... and yet we did not see the behavior it was showing in production.

We have spent quite some time on the phone with an Escalation Engineer from Microsoft trying to pinpoint the root cause of our problem. After we convinced the engineer that the root cause was not parallelism as he suggested we finally got some vital information that helped us understand what was happening. When he told us about the USERSTORE_TOKENPERM cache and its behavior we started running some tests and now knew what we were looking for. Sure enough the size of this cache just kept growing and the remove count certainly did not.

select distinct
    cc.cache_address,
    cc.name,
    cc.type,
    ch.clock_hand,
    cc.single_pages_kb + cc.multi_pages_kb as total_kb,
    cc.single_pages_in_use_kb + cc.multi_pages_in_use_kb as total_in_use_kb,
    cc.entries_count,
    cc.entries_in_use_count,
    ch.removed_all_rounds_count,
    ch.removed_last_round_count,
    ch.rounds_count
from sys.dm_os_memory_cache_counters cc
join sys.dm_os_memory_cache_clock_hands ch
    on cc.cache_address = ch.cache_address
where cc.type = 'USERSTORE_TOKENPERM'
order by total_kb desc

Since it was announced that this problem would be fixed in Service Pack 2 we decided to create some test scripts that really put stress on this cache. We tried to come as close as possible to our production situation at that time, which was sp_executesql for the CUD operations (UpdateBatch). Putting pressure on this cache was quite easy: make sure the query text differs so the queries get cached as different query plans, and execute each of them twice, since the first execution just inserts the entry into the cache and the problem comes with the lookup. We forced this behavior by executing queries with varchar parameters that increase in length every iteration. Also remember to use a login that is not a sysadmin, since for sysadmin logins SQL Server skips the TokenAndPermUserStore check. We ended up creating a couple of scripts to do just this and also to monitor the cache size. The 'engine' behind it is simple command files and SQLCMD. Unfortunately I cannot post the scripts since they are the property of the customer.

As far as we have seen from our tests the behavior has certainly improved in SP2. It takes a lot longer for the duration of the queries to increase, and it does not increase to the point where SQL Server becomes unresponsive, nor does the duration become unacceptably high. Since we are really pushing the cache to the limit it does still cause some increase in duration, but not nearly to the level SP1 reached. We do notice, however, that under the constant 'hammering' of these queries even SP2 has the greatest trouble keeping the cache under control, even when we call DBCC FREESYSTEMCACHE('TokenAndPermUserStore'). Obviously this scenario is not realistic since we really are trying to make it go wrong. Not only did the TokenAndPermUserStore behavior change in SP2, the size of the plan cache was also cut down dramatically.

The conclusion however is that under Service Pack 2 we do not experience the same catastrophic behavior as before.

A lot of time has passed and we are going to migrate to SQL Server 2005 on the 9th of September. We are using a whole new .NET Framework on the application tier with a whole new data access layer, so a lot of the factors that existed that last time have now changed. We are starting our stress tests next month, if I get the chance and if we find anything peculiar I will let you know.

Hopefully this time there will not be any surprises :-(

Tuesday, July 17, 2007

The target of 'replace' must be at most one node

One of the new features of SQL Server 2005 is the integration of XML in the database. Although XML in the database is not my favorite architecture there are situations where it is useful. Since we have not really used XML in a real life project before we are slowly learning its limitations.

A limitation we recently discovered is modifying more than 1 node at a time using XML DML. This limitation is documented in the Books Online under the "replace value of" topic: "If multiple nodes are selected, an error is raised."

The only idea we could come up with is to loop through the nodes until all of them have been updated. Luckily for us SQL Server also supports mixing XML DML with special functions like sql:variable() and sql:column().

Check out the code here since blogger is giving me issues with the XML string :-(
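In the meantime, a minimal sketch of the loop idea with made-up element names; the [1] predicate is what keeps 'replace value of' happy:

```sql
--Update every matching node by looping, since 'replace value of' takes one node at a time
DECLARE @x xml, @newRole int
SET @x = '<users><u role="2"/><u role="2"/><u role="3"/></users>'
SET @newRole = 9

--Keep replacing the first remaining match until none are left
WHILE @x.exist('/users/u[@role="2"]') = 1
BEGIN
    SET @x.modify('replace value of (/users/u[@role="2"]/@role)[1]
                   with sql:variable("@newRole")')
END

SELECT @x   --all role="2" attributes are now role="9"
```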

Monday, July 16, 2007

Windows Home Server - RTM

There was not such a great buzz about this edition (at least not here in Belgium) but nevertheless it is an interesting product. It contains some nice features like a home network health monitor which makes sure all updates are applied on your home PC's, full system restores, automatic backup and many more.

Check out the announcement here and more information here