TempDB 中的損壞分區如何導致 DBCC CHECKDB 報告沒有問題?
我們的一個 SQL Server 最近報告了以下錯誤:
DATE/TIME: 2/25/2013 9:15:14 PM DESCRIPTION: No catalog entry found for partition ID 9079262474267394048 in database 2. The metadata is inconsistent. Run DBCC CHECKDB to check for a metadata corruption.
不到 15 分鐘後,我連接到伺服器並執行:
SELECT name FROM sys.databases WHERE database_id = 2;
哪個返回’tempdb’。然後我跑了:
DBCC CHECKDB ('tempdb') WITH NO_INFOMSGS, TABLERESULTS;
沒有返回任何結果,表明受影響的數據庫沒有問題。
數據庫中的損壞如何導致上述錯誤消息
DBCC CHECKDB
但未報告問題?我假設如果頁面校驗和計算失敗,導致頁面被標記為懷疑引用該頁面的任何對像都無法刪除,但我一定是錯的。一旦頁面被標記為“可疑”,如何將其標記為不可疑、已修復或重複使用,或者任何
DBCC CHECKDB
不會報告相關頁面有任何問題的內容?編輯:2013-02-27 13:24
只是為了好玩,我試圖在 TempDB 中重新創建損壞,假設 #temp 表是罪魁禍首。
但是,由於我無法
SINGLE_USER
在 TempDB 中設置該選項,我無法使用它DBCC WRITEPAGE
來破壞頁面,因此我無法在 TempDB 中強制破壞。而不是使用
DBCC WRITEPAGE
一個可以將數據庫設置為離線並使用十六進制編輯器來修改 db 文件中的隨機字節。當然,這在 TempDB 上也不起作用,因為數據庫引擎無法在 TempDB 離線的情況下執行。如果停止實例,下次啟動時會自動重新創建 TempDB;因此,這也不會成功。
如果有人能想出一種方法來重現這種腐敗,我願意做進一步的研究。
為了測試損壞頁面無法修復的假設,
DROP TABLE
我創建了一個測試數據庫並使用以下腳本損壞頁面,然後嘗試刪除受影響的表。此處的結果是無法刪除該表;我不得不RESTORE DATABASE Testdb PAGE = ''...
為了恢復受影響的頁面。我假設如果我對相關頁面的其他部分進行了更改,也許該頁面可以用DROP TABLE
或者TRUNCATE table
./* ********************************************* */ /* ********************************************* */ /* DO NOT USE THIS CODE ON A PRODUCTION SYSTEM!! */ /* ********************************************* */ /* ********************************************* */ USE Master; GO ALTER DATABASE test SET RECOVERY FULL; BACKUP DATABASE Test TO DISK = 'Test_db.bak' WITH FORMAT , INIT , NAME = 'Test Database backup' , SKIP , NOREWIND , NOUNLOAD , COMPRESSION , STATS = 1; BACKUP LOG Test TO DISK = 'Test_log.bak' WITH FORMAT , INIT , NAME = 'Test Log backup' , SKIP , NOREWIND , NOUNLOAD , COMPRESSION , STATS = 1; GO ALTER DATABASE test SET SINGLE_USER; GO USE Test; GO IF EXISTS (SELECT name FROM sys.key_constraints WHERE name = 'PK_temp') ALTER TABLE temp DROP CONSTRAINT PK_temp; IF EXISTS (SELECT name FROM sys.default_constraints WHERE name = 'DF_temp_testdata') ALTER TABLE temp DROP CONSTRAINT DF_temp_testdata; IF EXISTS (SELECT name FROM sys.tables WHERE name = 'temp') DROP TABLE temp; GO CREATE TABLE temp ( tempID INT NOT NULL CONSTRAINT PK_temp PRIMARY KEY CLUSTERED IDENTITY(1,1) , testdata uniqueidentifier CONSTRAINT DF_temp_testdata DEFAULT (NEWID()) ); GO /* insert 10 rows into #temp */ INSERT INTO temp default values; GO 10 /* get some necessary parameters */ DECLARE @partitionID bigint; DECLARE @dbid smallint; DECLARE @tblid int; DECLARE @indexid int; DECLARE @pageid bigint; DECLARE @offset INT; DECLARE @fileid INT; SELECT @dbid = db_id('Test') , @tblid = t.object_id , @partitionID = p.partition_id , @indexid = i.index_id FROM sys.tables t INNER JOIN sys.partitions p ON t.object_id = p.object_id INNER JOIN sys.indexes i on t.object_id = i.object_id WHERE t.name = 'temp'; SELECT TOP(1) @fileid = file_id FROM sys.database_files; SELECT TOP(1) @pageid = allocated_page_page_id FROM sys.dm_db_database_page_allocations(@dbid, @tblid, null, @partitionID, 'LIMITED') WHERE allocation_unit_type = 1; /* get a random offset into the 8KB page */ SET @offset = FLOOR(rand() * 8192); SELECT @offset; /* 0x75 below is the letter 't' */ DBCC WRITEPAGE (@dbid, @fileid, @pageid, @offset, 1, 0x74, 1); SELECT * FROM temp; Msg 824, Level 24, State 2, Line 36 SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0x298b2ce9; actual: 0x2ecb2ce9). It occurred during a read of page (1:1054) in database ID 7 at offset 0x0000000083c000 in file 'C:\SQLServer \MSSQL11.MSSQLSERVER\MSSQL\DATA\Test.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
此時您與數據庫引擎斷開連接,因此重新連接以繼續。
USE Test; DBCC CHECKDB WITH NO_INFOMSGS, TABLERESULTS;
這里報告了腐敗。
DROP TABLE temp; Msg 824, Level 24, State 2, Line 36 SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0x298b2ce9; actual: 0x2ecb2ce9). It occurred during a read of page (1:1054) in database ID 7 at offset 0x0000000083c000 in file 'C:\SQLServer \MSSQL11.MSSQLSERVER\MSSQL\DATA\Test.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
此處報告腐敗,
DROP TABLE
失敗。/* assuming ENTERPRISE or DEVELOPER edition of SQL Server, I can use PAGE='' to restore a single page from backup */ USE Master; RESTORE DATABASE Test PAGE = '1:1054' FROM DISK = 'Test_db.bak'; BACKUP LOG Test TO DISK = 'Test_log_1.bak'; RESTORE LOG Test FROM DISK = 'Test_log.bak'; RESTORE LOG Test FROM DISK = 'Test_log_1.bak';
編輯#2,添加請求的@@VERSION 資訊。
SELECT @@VERSION;
回報:
Microsoft SQL Server 2012 (SP1) - 11.0.3000.0 (X64) Oct 19 2012 13:38:57 Copyright (c) Microsoft Corporation Enterprise Evaluation Edition (64-bit) on Windows NT 6.2 <X64> (Build 9200: )
我知道這是評估版,我們有企業版的密鑰,很快就會進行版本升級。
這是一個已知問題的修復:
修復:使用 SQL Server 2012 時出現“未找到數據庫中分區 ID 的目錄條目”錯誤
假設您在 Microsoft SQL Server 2012 中查詢 tempdb.sys.allocation_units 表。當您在查詢中使用 NOLOCK 提示或查詢處於 READ UNCOMMITED 事務隔離級別下時,您會收到以下間歇性 608 錯誤消息:
錯誤:608 嚴重性:16 狀態:1
沒有為數據庫中的分區找到目錄條目。元數據不一致。執行 DBCC CHECKDB 以檢查元數據損壞
注意 DBCC CHECKDB 命令不顯示任何數據庫損壞的跡象。
固定在:
您的版本 (11.0.3000.0) 是 SQL Server 2012 SP1 RTM