SQL Server – Undelete a Table and Restore a Single Table from Backup

T-SQL Tuesday

This post is part of the monthly community event called T-SQL Tuesday started by Adam Machanic (blog|twitter) and hosted by someone else each month. This month the host is Sankar Reddy (blog|twitter) and the topic is Misconceptions in SQL Server. You can follow posts for this theme on Twitter by looking at #TSQL2sDay hashtag.

Let me start by saying: This code is a crazy hack that is to never be used unless you really, really have to. Really!

And I don’t think there’s a time when you would really have to use it for real. Because it’s a hack there are number of things that can go wrong so play with it knowing that. I’ve managed to totally corrupt one database. :) Oh… and for those saying: yeah yeah.. you have a single table in a file group and you’re restoring that, I say “nay nay” to you.

As we all know SQL Server can’t do single table restores from backup. This is kind of a obvious thing due to different relational integrity (RI) concerns. Since we have to maintain that we have to restore all tables represented in a RI graph. For this exercise i say BAH! to those concerns. Note that this method “works” only for simple tables that don’t have LOB and off rows data. The code can be expanded to include those but I’ve tried to leave things “simple”. Note that for this to work our table needs to be relatively static data-wise. This doesn’t work for OLTP table. Products are a perfect example of static data. They don’t change much between backups, pretty much everything depends on them and their table is one of those tables that are relatively easy to accidentally delete everything from.

This only works if the database is in Full or Bulk-Logged recovery mode for tables where the contents have been deleted or truncated but NOT when a table was dropped.

Everything we’ll talk about has to be done before the data pages are reused for other purposes. After deletion or truncation the pages are marked as reusable so you have to act fast. The best thing probably is to put the database into single user mode ASAP while you’re performing this procedure and return it to multi user after you’re done.

How do we do it?

We will be using an undocumented but known DBCC commands: DBCC PAGE, an undocumented function sys.fn_dblog and a little known DATABASE RESTORE PAGE option. All tests will be on a copy of Production.Product table in AdventureWorks database called Production.Product1 because the original table has FK constraints that prevent us from truncating it for testing.

-- create a duplicate table. This doesn't preserve indexes!
SELECT *
INTO AdventureWorks.Production.Product1
FROM AdventureWorks.Production.Product
 
After we run this code take a full back to perform further testing.
 

First let’s see what the difference between DELETE and TRUNCATE is when it comes to logging. With DELETE every row deletion is logged in the transaction log. With TRUNCATE only whole data page deallocations are logged in the transaction log. Getting deleted data pages is simple. All we have to look for is row delete entry in the sys.fn_dblog output. But getting data pages that were truncated from the transaction log presents a bit of an interesting problem. I will not go into depths of IAM(Index Allocation Map) and PFS (Page Free Space) pages but suffice to say that every IAM page has intervals that tell us which data pages are allocated for a table and which aren’t. If we deep dive into the sys.fn_dblog output we can see that once you truncate a table all the pages in all the intervals are deallocated and this is shown in the PFS page transaction log entry as deallocation of pages. For every 8 pages in the same extent there is one PFS page row in the transaction log. This row holds information about all 8 pages in CSV format which means we can get to this data with some parsing. A great help for parsing this stuff is Peter Debetta’s handy function dbo.HexStrToVarBin that converts hexadecimal string into a varbinary value that can be easily converted to integer tus giving us a readable page number.

The shortened (columns removed) sys.fn_dblog output for a PFS page with CSV data for 1 extent (8 data pages) looks like this:

-- [Page ID] is displayed in hex format. 
-- To convert it to readable int we'll use dbo.HexStrToVarBin function found at
-- http://sqlblog.com/blogs/peter_debetta/archive/2007/03/09/t-sql-convert-hex-string-to-varbinary.aspx
-- This function must be installed in the master database
SELECT Context, AllocUnitName, [Page ID], Description
FROM sys.fn_dblog(NULL, NULL)
WHERE [Current LSN] = '00000031:00000a46:007d'

image

The pages at the end marked with 0x00—> are pages that are allocated in the extent but are not part of a table.

We can inspect the raw content of each data page with a DBCC PAGE command:

-- we need this trace flag to redirect output to the query window.
DBCC TRACEON (3604);
-- WITH TABLERESULTS gives us data in table format instead of message format
-- we use format option 3 because it's the easiest to read and manipulate further on
DBCC PAGE (AdventureWorks, 1, 613, 3) WITH TABLERESULTS
  

Since the DBACC PAGE output can be quite extensive I won’t put it here. You can see an example of it in the link at the beginning of this section.

Getting deleted data back

When we run a delete statement every row to be deleted is marked as a ghost record. A background process periodically cleans up those rows. A huge misconception is that the data is actually removed. It’s not. Only the pointers to the rows are removed while the data itself is still on the data page. We just can’t access it with normal means. To get those pointers back we need to restore every deleted page using the RESTORE PAGE option mentioned above. This restore must be done from a full backup, followed by any differential and log backups that you may have. This is necessary to bring the pages up to the same point in time as the rest of the data.  However the restore doesn’t magically connect the restored page back to the original table. It simply replaces the current page with the one from the backup. After the restore we use the DBCC PAGE to read data directly from all data pages and insert that data into a temporary table. To finish the RESTORE PAGE  procedure we finally have to take a tail log backup (simple backup of the transaction log) and restore it back. We can now insert data from the temporary table to our original table by hand.

Getting truncated data back

When we run a truncate the truncated data pages aren’t touched at all. Even the pointers to rows stay unchanged. Because of this getting data back from truncated table is simple. we just have to find out which pages belonged to our table and use DBCC PAGE to read data off of them. No restore is necessary. Turns out that the problems we had with finding the data pages is alleviated by not having to do a RESTORE PAGE procedure.

Stop stalling… show me The Code!

This is the code for getting back deleted and truncated data back. It’s commented in all the right places so don’t be afraid to take a closer look. Make sure you have a full backup before trying this out. Also I suggest that the last step of backing and restoring the tail log is performed by hand.

USE master
GO
IF OBJECT_ID('dbo.HexStrToVarBin') IS NULL
RAISERROR ('No dbo.HexStrToVarBin installed.
Go to http://sqlblog.com/blogs/peter_debetta/archive/2007/03/09/t-sql-convert-hex-string-to-varbinary.aspx
and install it in master database'
, 18, 1)
SET NOCOUNT ON
BEGIN TRY
DECLARE @dbName VARCHAR(1000), @schemaName VARCHAR(1000), @tableName VARCHAR(1000),
@fullBackupName VARCHAR(1000), @undeletedTableName VARCHAR(1000),
@sql VARCHAR(MAX), @tableWasTruncated bit;
/*
THE FIRST LINE ARE OUR INPUT PARAMETERS
In this case we're trying to recover Person.Address1 table in AdventureWorks database.
My full backup of AdventureWorks database is at e:\AW.bak
*/
SELECT @dbName = 'AdventureWorks', @schemaName = 'Production', @tableName = 'Product1', @fullBackupName = 'e:\AW.bak',
@undeletedTableName = '##' + @tableName + '_Undeleted', @tableWasTruncated = 0,
-- copy the structure from original table to a temp table that we'll fill with restored data
@sql = 'IF OBJECT_ID(''tempdb..' + @undeletedTableName +
''') IS NOT NULL DROP TABLE ' + @undeletedTableName +
' SELECT *' +
' INTO ' + @undeletedTableName +
' FROM [' + @dbName + '].[' + @schemaName + '].[' + @tableName + ']' +
' WHERE 1 = 0'
EXEC (@sql)
IF OBJECT_ID('tempdb..#PagesToRestore') IS NOT NULL
DROP TABLE #PagesToRestore
/* FIND DATA PAGES WE NEED TO RESTORE*/
CREATE TABLE #PagesToRestore ([ID] INT IDENTITY(1,1), [FileID] INT, [PageID] INT,
[SQLtoExec] VARCHAR(1000)) -- DBCC PACE statement to run later
RAISERROR ('Looking for deleted pages...', 10, 1)
-- use T-LOG direct read to get deleted data pages
INSERT INTO #PagesToRestore([FileID], [PageID], [SQLtoExec])
EXEC('USE [' + @dbName + '];SELECT FileID, PageID, ''DBCC TRACEON (3604); DBCC PAGE ([' + @dbName +
'], '' + FileID + '', '' + PageID + '', 3) WITH TABLERESULTS'' as SQLToExec
FROM (SELECT DISTINCT LEFT([Page ID], 4) AS FileID, CONVERT(VARCHAR(100), '
+
'CONVERT(INT, master.dbo.HexStrToVarBin(SUBSTRING([Page ID], 6, 20)))) AS PageID
FROM sys.fn_dblog(NULL, NULL)
WHERE AllocUnitName LIKE '
'%' + @schemaName + '.' + @tableName + '%'' ' +
'AND Context IN (''LCX_MARK_AS_GHOST'', ''LCX_HEAP'') AND Operation in (''LOP_DELETE_ROWS''))t');
SELECT *
FROM #PagesToRestore
-- if upper EXEC returns 0 rows it means the table was truncated so find truncated pages
IF (SELECT COUNT(*) FROM #PagesToRestore) = 0
BEGIN
RAISERROR ('No deleted pages found. Looking for truncated pages...', 10, 1)
-- use T-LOG read to get truncated data pages
INSERT INTO #PagesToRestore([FileID], [PageID], [SQLtoExec])
-- dark magic happens here
-- because truncation simply deallocates pages we have to find out which pages were deallocated.
-- we can find this out by looking at the PFS page row's Description column.
-- for every deallocated extent the Description has a CSV of 8 pages in that extent.
-- then it's just a matter of parsing it.
-- we also remove the pages in the extent that weren't allocated to the table itself
-- marked with '0x00-->00'
EXEC ('USE [' + @dbName + '];DECLARE @truncatedPages TABLE(DeallocatedPages VARCHAR(8000), IsMultipleDeallocs BIT);
INSERT INTO @truncatedPages
SELECT REPLACE(REPLACE(Description, '
'Deallocated '', ''Y''), ''0x00-->00 '', ''N'') + '';'' AS DeallocatedPages,
CHARINDEX('
';'', Description) AS IsMultipleDeallocs
FROM (
SELECT DISTINCT LEFT([Page ID], 4) AS FileID, CONVERT(VARCHAR(100),
CONVERT(INT, master.dbo.HexStrToVarBin(SUBSTRING([Page ID], 6, 20)))) AS PageID,
Description
FROM sys.fn_dblog(NULL, NULL)
WHERE Context IN ('
'LCX_PFS'') AND Description LIKE ''Deallocated%''
AND AllocUnitName LIKE '
'%' + @schemaName + '.' + @tableName + '%'') t;
SELECT FileID, PageID
, '
'DBCC TRACEON (3604); DBCC PAGE ([' + @dbName + '], '' + FileID + '', '' + PageID + '', 3) WITH TABLERESULTS'' as SQLToExec
FROM (
SELECT LEFT(PageAndFile, 1) as WasPageAllocatedToTable
, SUBSTRING(PageAndFile, 2, CHARINDEX('
':'', PageAndFile) - 2 ) as FileID
, CONVERT(VARCHAR(100), CONVERT(INT,
master.dbo.HexStrToVarBin(SUBSTRING(PageAndFile, CHARINDEX('
':'', PageAndFile) + 1, LEN(PageAndFile))))) as PageID
FROM (
SELECT SUBSTRING(DeallocatedPages, delimPosStart, delimPosEnd - delimPosStart) as PageAndFile, IsMultipleDeallocs
FROM (
SELECT *,
CHARINDEX('
';'', DeallocatedPages)*(N-1) + 1 AS delimPosStart,
CHARINDEX('
';'', DeallocatedPages)*N
AS delimPosEnd
FROM @truncatedPages t1
CROSS APPLY
(SELECT TOP (case when t1.IsMultipleDeallocs = 1 then 8 else 1 end)
ROW_NUMBER() OVER(ORDER BY number) as N
FROM master..spt_values) t2
)t)t)t
WHERE WasPageAllocatedToTable = '
'Y''')
SELECT @tableWasTruncated = 1
END
DECLARE @lastID INT, @pagesCount INT
SELECT @lastID = 1, @pagesCount = COUNT(*) FROM #PagesToRestore
SELECT @sql = 'Number of pages to restore: ' + CONVERT(VARCHAR(10), @pagesCount)
IF @pagesCount = 0
RAISERROR ('No data pages to restore.', 18, 1)
ELSE
RAISERROR (@sql, 10, 1)
-- If the table was truncated we'll read the data directly from data pages without restoring from backup
IF @tableWasTruncated = 0
BEGIN
-- RESTORE DATA PAGES FROM FULL BACKUP IN BATCHES OF 200
WHILE @lastID <= @pagesCount
BEGIN
-- create CSV string of pages to restore
SELECT @sql = STUFF((SELECT ',' + CONVERT(VARCHAR(100), FileID) + ':' + CONVERT(VARCHAR(100), PageID)
FROM #PagesToRestore WHERE ID BETWEEN @lastID AND @lastID + 200
ORDER BY ID FOR XML PATH('')), 1, 1, '')
SELECT @sql = 'RESTORE DATABASE [' + @dbName + '] PAGE = ''' + @sql + ''' FROM DISK = ''' + @fullBackupName + ''''
RAISERROR ('Starting RESTORE command:' , 10, 1) WITH NOWAIT;
RAISERROR (@sql , 10, 1) WITH NOWAIT;
EXEC(@sql);
RAISERROR ('Restore DONE' , 10, 1) WITH NOWAIT;
SELECT @lastID = @lastID + 200
END
/*
If you have any differential or transaction log backups you
should restore them here to bring the previously restored data pages up to date
*/
END
DECLARE @dbccSinglePage TABLE
(
[ParentObject] NVARCHAR(500),
[Object] NVARCHAR(500),
[Field] NVARCHAR(500),
[VALUE] NVARCHAR(MAX)
)
DECLARE @cols NVARCHAR(MAX), @paramDefinition NVARCHAR(500), @SQLtoExec VARCHAR(1000),
@FileID VARCHAR(100), @PageID VARCHAR(100), @i INT = 1
-- Get deleted table columns from information_schema view
-- Need sp_executeSQL because database name can't be passed in as variable
SELECT @cols = 'select @cols = STUFF((SELECT '', ['' + COLUMN_NAME + '']''
FROM '
+ @dbName + '.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = '
'' + @tableName + ''' AND
TABLE_SCHEMA = '
'' + @schemaName + '''
ORDER BY ORDINAL_POSITION
FOR XML PATH('
''')), 1, 2, '''')', @paramDefinition = N'@cols nvarchar(max) OUTPUT'
EXECUTE sp_executesql @cols, @paramDefinition, @cols = @cols OUTPUT
-- Loop through all the restored data pages,
-- read data from them and insert them into temp table
-- which you can then insert into the orignial deleted table
DECLARE dbccPageCursor CURSOR GLOBAL FORWARD_ONLY FOR
SELECT [FileID], [PageID], [SQLtoExec] FROM #PagesToRestore ORDER BY [FileID], [PageID]
OPEN dbccPageCursor;
FETCH NEXT FROM dbccPageCursor INTO @FileID, @PageID, @SQLtoExec;
WHILE @@FETCH_STATUS = 0
BEGIN
RAISERROR ('---------------------------------------------', 10, 1) WITH NOWAIT;
SELECT @sql = 'Loop iteration: ' + CONVERT(VARCHAR(10), @i);
RAISERROR (@sql, 10, 1) WITH NOWAIT;

SELECT @sql = 'Running: ' + @SQLtoExec
RAISERROR (@sql, 10, 1) WITH NOWAIT;

-- if something goes wrong with DBCC execution or data gathering, skip it but print error
BEGIN TRY
INSERT INTO @dbccSinglePage EXEC (@SQLtoExec)

-- make the data insert magic happen here
IF (SELECT CONVERT(BIGINT, [VALUE]) FROM @dbccSinglePage WHERE [Field] LIKE '%Metadata: ObjectId%')
= OBJECT_ID('['+@dbName+'].['+@schemaName +'].['+@tableName+']')
BEGIN
DELETE @dbccSinglePage
WHERE NOT ([ParentObject] LIKE 'Slot % Offset %' AND [Object] LIKE 'Slot % Column %')

SELECT @sql = 'USE tempdb; ' +
'IF (OBJECTPROPERTY(object_id(''' + @undeletedTableName + '''), ''TableHasIdentity'') = 1) ' +
'SET IDENTITY_INSERT ' + @undeletedTableName + ' ON; ' +
'INSERT INTO ' + @undeletedTableName +
'(' + @cols + ') ' +
STUFF((SELECT ' UNION ALL SELECT ' +
STUFF((SELECT ', ' + CASE WHEN VALUE = '[NULL]' THEN 'NULL' ELSE '''' + [VALUE] + '''' END
FROM (
-- the unicorn help here to correctly set ordinal numbers of columns in a data page
-- it's turning STRING order into INT order (1,10,11,2,21 into 1,2,..10,11...21)
SELECT [ParentObject], [Object], Field, VALUE,
RIGHT('00000' + O1, 6) AS ParentObjectOrder,
RIGHT('00000' + REVERSE(LEFT(O2, CHARINDEX(' ', O2)-1)), 6) AS ObjectOrder
FROM (
SELECT [ParentObject], [Object], Field, VALUE,
REPLACE(LEFT([ParentObject],
CHARINDEX('Offset', [ParentObject])-1),
'Slot ', '') AS O1,
REVERSE(LEFT([Object],
CHARINDEX('Offset ', [Object])-2)) AS O2
FROM @dbccSinglePage
WHERE t.ParentObject = ParentObject )t)t
ORDER BY ParentObjectOrder, ObjectOrder
FOR XML PATH('')), 1, 2, '')
FROM @dbccSinglePage t
GROUP BY ParentObject
FOR XML PATH('')
), 1, 11, '') + ';'
RAISERROR (@sql, 10, 1) WITH NOWAIT;
EXEC (@sql)
END
END TRY
BEGIN CATCH
SELECT @sql = 'ERROR!!!' + CHAR(10) + CHAR(13) +
'ErrorNumber: ' + ERROR_NUMBER() + '; ErrorMessage' + ERROR_MESSAGE() +
CHAR(10) + CHAR(13) + 'FileID: ' + @FileID + '; PageID: ' + @PageID
RAISERROR (@sql, 10, 1) WITH NOWAIT;
END CATCH
DELETE @dbccSinglePage
SELECT @sql = 'Pages left to process: ' + CONVERT(VARCHAR(10), @pagesCount - @i) +
CHAR(10) + CHAR(13) + CHAR(10) + CHAR(13) + CHAR(10) + CHAR(13), @i = @i+1
RAISERROR (@sql, 10, 1) WITH NOWAIT;
FETCH NEXT FROM dbccPageCursor INTO @FileID, @PageID, @SQLtoExec;
END
CLOSE dbccPageCursor; DEALLOCATE dbccPageCursor;
EXEC ('SELECT ''' + @undeletedTableName + ''' as TableName; SELECT * FROM ' + @undeletedTableName)
END TRY
BEGIN CATCH
SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage
IF CURSOR_STATUS ('global', 'dbccPageCursor') >= 0
BEGIN
CLOSE dbccPageCursor;
DEALLOCATE dbccPageCursor;
END
END CATCH

-- if the table was deleted we need to finish the restore page sequence
IF @tableWasTruncated = 0
BEGIN
-- take a log tail backup and then restore it to complete page restore process
DECLARE @currentDate VARCHAR(30)
SELECT @currentDate = CONVERT(VARCHAR(30), GETDATE(), 112)

RAISERROR ('Starting Log Tail backup to c:\Temp ...', 10, 1) WITH NOWAIT;
PRINT ('BACKUP LOG [' + @dbName + '] TO DISK = ''c:\Temp\' + @dbName + '_TailLogBackup_' + @currentDate + '.trn''')
EXEC ('BACKUP LOG [' + @dbName + '] TO DISK = ''c:\Temp\' + @dbName + '_TailLogBackup_' + @currentDate + '.trn''')
RAISERROR ('Log Tail backup done.', 10, 1) WITH NOWAIT;

RAISERROR ('Starting Log Tail restore from c:\Temp ...', 10, 1) WITH NOWAIT;
PRINT ('RESTORE LOG [' + @dbName + '] FROM DISK = ''c:\Temp\' + @dbName + '_TailLogBackup_' + @currentDate + '.trn''')
EXEC ('RESTORE LOG [' + @dbName + '] FROM DISK = ''c:\Temp\' + @dbName + '_TailLogBackup_' + @currentDate + '.trn''')
RAISERROR ('Log Tail restore done.', 10, 1) WITH NOWAIT;
END

-- The last step is manual. Insert data from our temporary table to the original deleted table

The misconception here is that you can do a single table restore properly in SQL Server. You can't. But with little experimentation you can get pretty close to it.

One way to possible remove a dependency on a backup to retrieve deleted pages is to quickly run a similar script to the upper one that gets data directly from data pages while the rows are still marked as ghost records. It could be done if we could beat the ghost record cleanup task.


Avtor: Anonymous, objavljeno na portalu SloDug.si (Arhiv)

Leave a comment

Please note that we won't show your email to others, or use it for sending unwanted emails. We will only use it to render your Gravatar image and to validate you as a real person.