Thursday, January 26, 2006

Slow mass deletes

My favorite Lead DBA Patrick asked me to find out how we could delete millions of rows from a table in the fastest way possible. You guessed it... this too is linked to the archiving project :-)

I was inspired by a blogpost from Kimberly Tripp where she tested mass deletes extensively with all different kinds of indexing.

The conclusion was that deletes from tables with non-clustered indexes was a lot slower as opposed to tables with only a clustered index.

When I started testing the delete options I suddenly noticed that there was a lot of locking activity on TempDB. Freaky as I can get I needed to find out why this was happening.

So off to investigate!

I've created a small table with the following script:

(myID int IDENTITY(1,1),
myChar char(500),
myVarChar varchar(500)


DECLARE @i int
SET @i = 0

WHILE @i < 10000
INSERT INTO TestNCDelete (myChar, myVarChar)
VALUES (' ', REPLICATE('A', 500))

SET @i = @i + 1


CREATE INDEX IX_myChar ON TestNCDelete (myChar)
CREATE INDEX IX_myVarChar ON TestNCDelete (myVarChar)

Let's delete everything from this table:



Check the active locks with sp_lock and there you go... lots and lots of extent locks on TempDB.

Now why is this happening?
It is actually very simple to find out... the query plan says it all!

After the Clustered Index Delete you will see a Table Spool/Eager Spool action for every non-clustered index. The description of this action is quite clear: "Stores the data from the input in a temporary table in order to optimize rewinds". This is followed by a Sort, an Index Delete, a Sequence and finally the delete is final.

You can imagine that these spools, sorts, ... can be quite intrusive when we are talking about 40 to 100 million rows.

Another mystery solved!

Tuesday, January 24, 2006

Fun with RAW Destination files

As I said before one of the hot projects is Archiving. Because it is the first time a functional archiving process will take place we are talking about a lot of data and a small maintenance window. All this makes performance a key factor.

How do you achieve high performance with SSIS exports?
Use the Raw File Destination.

We have tested a couple of parallel exports to Flat File and repeated this action to the Raw File Destination. The export process went from 31 minutes to 26 minutes and the file size decreased to an incredible 2/3 of the size the Flat File Destination took. Now this may not look like a big gain but all this was as easy as changing the destination type. Another problem is that we are reaching our max read performance. I am convinced that there are more tuning options available and will probably be exploring these tomorrow.

The import process is blazing fast but I will be tuning this even more tomorrow too... let's see what we learn from that. Stay tuned!

Monday, January 23, 2006

Fun with Lookups in SSIS

One of the current hot projects here is archiving (the database grows with +1GB/day). For one of the solutions we need two files from the same table based on a staging table. This is now done in 2 steps so I tried to figure out a way to read the table only once.

I noticed that the Lookup Transformation has an option to redirect a row when an error occurs. This enables us to separate the output to 2 separate files based on the lookup table.

I started testing and I was thrilled by the performance of these transformations. So off to do the 'real' work, a lookup table with 40.000.000 records (holding just a guid) to export a table with 120.000.000 records. Unfortunately I hit the first 32-bit limitation :-( Because of the massive amount of data SSIS is having trouble reading all the records in memory. I get up to about 38.000.000 rows and then the process stalls. After quite some time I get the following error:
The Data Flow task failed to create a required thread and cannot begin running. The usually occurs when there is an out-of-memory state.

SSIS cannot use AWE memory so it depends entirely on the VAS memory meaning only 2GB (or 3 with /3GB enabled) on a 32-bit environment. If we were running a 64-bit environment all my worries would be over :-(

I could skip the cache or cache less data but then it becomes quite slow and the 'double' export runs better in that case.

Who knows... maybe tomorrow I will find a solution.