Not sure how many of you have dealt with with large volumes of data, but a couple million entries per hour in my database is no unheard of (I average over 1million entries per hour during peak). I need to keep 2 weeks of data in-line.
This left me with a big problem of how to archive old data. Deleting 12 Million+ records overnight was starting to run very long (and blocking the database so that Winsyslog couldn't insert new data). Deleting on the weekend had the same issues since this would run 48+ hours.
So, exporting the data (for longer term archival) before deletion was just out of the question - instead, I write events both to DB and to log file and just zip/archive the log file nightly.
Thanks to a suggestion from a co-worker, I created a stored procedure on MSSQL 2000 server that runs every hour and only deletes the data in 10,000 record increments - with a pause between each incremental delete to ensure WinSyslog can insert new data that it has queued up.
The SP deletes data that is 2 weeks old. So, I'm only deleting 1 hour of data each time since the SP runs hourly.
Some additional improvements to this that I've done are:
1. "Skew" the deletes by 12 hours (so I'm actually deleting data that's 14.5 days old). This way, when I'm getting peak number of events coming into the database at 9:00am, I'm deleting off-peak data from 9:00pm - less to delete means that it finishes faster.
2. Added the "WITH (PAGLOCK)" table hint so that (though the delete takes a bit longer) SQL doesn't lock the whole table - so INSERTs can still occur while I'm deleting the old data. NOTE: in truth, SQL will still escalate the pagelock to a tablelock after some threshold is hit.
If anyone knows how to control this threshold, please let me know
The code for this is below. There are 3 values you'll want to look at before just running the code:
1. The number of events being deleted (10,000) - you need to change both the "IF @currentrows >" statement and the "SET @delcount" values to the same value (these are in adjacent rows).
2. The number of seconds to wait between batch deletes - the "WAITFOR DELAY" statement at the end.
3. You'll probably need to change the "SET @log_datetime" value. This value dictates the data you want to delete (e.g. how old the data is). In my instance I want 14.5 days as explained above. Since I'm also storing everything in GMT time (but my server clock is GMT-5), I also add 5 hours to the time to ensure I'm deleting the expected data.
You shouldn't need to change both #1 and #2, the only one you should NEED to change is the seconds to wait (#2) - either increasing or decreasing this from 10 seconds.
If you read my other posting, you'll know that I have a CLUSTERED index on the ReceivedAt field. This is CRITICAL since performing inequality operators (e.g. less-than-or-equal-to), such as in the DELETE statement, will perform several magnitudes better since the DELETE can seek the index. Not having an index, or even having a NON-CLUSTERED index, will cause a table scan - in other words, SQL will read every record in the table. Table Scans are the worst things for performance and will certainly cause Table Locks.
Anyway, here's the code - I've commented it pretty heavily. I think all of the code except for the table hint are portable to MySQL if anyone wants to try it.
Let me know if you find it useful and/or can improve on it...
- Code: Select all
CREATE PROCEDURE sp_Delete14DayOldData AS
DECLARE @log_datetime DATETIME,
@rowcount INT,
@currentrows INT,
@delcount INT
-- Get the current datetime minus 14 days -- we want to delete anything older than this value
-- Actually offsetting this by 1/2 day (12 hours) to ensure that during peak periods (for example, 9am)
-- a lower number of events are being deleted (9pm). This is to distribute load a bit.
-- Lastly, since I'm storing everything in GMT time and my server is sitting in EST(GMT-5),
-- I need to ADD 5 hours to ensure I'm deleting data that was properly offset by 1/2 day
SET @log_datetime = DateAdd(hh,5,GETDATE() - 14.5)
-- get the total number of rows older than the above value
SET @rowcount =
(SELECT COUNT(*)
FROM SystemEvents
WHERE receivedat <= @log_datetime)
-- set the current number of rows left to delete "currentrows" equal to the total number of rows to delete
set @currentrows = @rowcount
-- as long as there are still more rows to delete "currentrows > 0", then we can continue to loop
WHILE (@currentrows > 0)
BEGIN
-- if there are more than 10,000 rows available to delete, we want to delete a maximum of 10,000 rows
-- else there are less than 10,000 and we want to delete all of them
IF (@currentrows > 10000)
SET @delcount = 10000
ELSE
SET @delcount = @currentrows
-- here's the MAGIC command. This limits the number of rows that the DELETE statement will impact
SET rowcount @delcount
DELETE FROM SystemEvents WITH (PAGLOCK) WHERE receivedat <= @log_datetime
-- now that we've deleted @delcount number of rows, we need to decrement the number of rows left to be deleted
set @currentrows = @currentrows - @delcount
-- if there are more rows left to delete, we want to wait 10 seconds for other processes to be able to complete
-- before locking the table with another delete command
-- otherwise, the number of rows left to delete is zero so the sp will just exit at the end of this loop
IF (@currentrows >0)
waitfor delay '00:00:10'
END
GO