Sql-Server
SQL Server 檢測到基於邏輯一致性的 I/O 錯誤:RESTORE 上的 pageid 不正確
將我們的數據庫恢復到新的伺服器硬體失敗並出現頁面錯誤。
Message SQL Server detected a logical consistency-based I/O error: incorrect pageid (expected 49:8125916; actual 49:29097436). It occurred during a read of page (49:8125916) in database ID 7 at offset 0x00000f7fbb8000 in file x:\Databases\Data07\DataWarehouse_OperationalData_15.ndf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
在這個 70 TB 數據庫和 6 年曆史的伺服器上,DBCC CheckDB 將花費我們大約一周的時間。
有沒有可能是我的備份損壞了?
還是說新伺服器有問題?
還是問題出在舊伺服器上的生產數據庫?
這是 SQL Server 2016 SP1 CU1,page_verify_option_desc 是 CHECKSUM。
這是我從新伺服器上的錯誤日誌中可以看到的:
Starting up database 'DataWarehouse'. The database 'DataWarehouse' is marked RESTORING and is in a state that does not allow recovery to be run. Error: 824, Severity: 16, State: 2. SQL Server detected a logical consistency-based I/O error: incorrect pageid (expected 49:8125916; actual 49:29097436). It occurred during a read of page (49:8125916) in database ID 7 at offset 0x00000f7fbb8000 in file 'S:\MSSQL\DSA\Databases\DataWarehouse\Data07\DataWarehouse_OperationalData_15.ndf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online. Error: 18456, Severity: 14, State: 38.
當我在舊伺服器上執行它時:
DBCC TRACEON (3604); DBCC PAGE ('datawarehouse', 49, 8125916, 0);
我明白了
DBCC execution completed. If DBCC printed error messages, contact your system administrator. PAGE: (49:8125916) BUFFER: BUF @0x000000802C5E9300 bpage = 0x0000004DF03E8000 bhash = 0x0000000000000000 bpageno = (49:8125916) bdbid = 5 breferences = 1 bcputicks = 0 bsampleCount = 0 bUse1 = 5844 bstat = 0x9 blog = 0x15ab215a bnext = 0x0000000000000000 bDirtyContext = 0x0000000000000000 bstat2 = 0x0 PAGE HEADER: Page @0x0000004DF03E8000 m_pageId = (49:8125916) m_headerVersion = 1 m_type = 3 m_typeFlagBits = 0x0 m_level = 0 m_flagBits = 0xa200 m_objId (AllocUnitId.idObj) = 10814647 m_indexId (AllocUnitId.idInd) = 256 Metadata: AllocUnitId = 72058302786633728 Metadata: PartitionId = 72058370492596224 Metadata: IndexId = 1 Metadata: ObjectId = 1916450641 m_prevPage = (0:0) m_nextPage = (0:0) pminlen = 0 m_slotCnt = 2 m_freeCnt = 2452 m_freeData = 5764 m_reservedCnt = 0 m_lsn = (2455108:11140830:20) m_xactReserved = 0 m_xdesId = (28:447547798) m_ghostRecCnt = 0 m_tornBits = 31453216 DB Frag ID = 1 Allocation Status GAM (49:7668480) = ALLOCATED SGAM (49:7668481) = NOT ALLOCATED PFS (49:8120352) = 0x42 ALLOCATED 80_PCT_FULL DIFF (49:7668486) = CHANGED ML (49:7668487) = NOT MIN_LOGGED DBCC execution completed. If DBCC printed error messages, contact your system administrator.
DBCC CHECKFILEGROUP 目前正在執行。完成沒有錯誤。
SeanGallardy 評論了二進制數。我花了很長時間才明白他的意思,但我終於在程序員模式的 Windows 計算器中輸入了這兩個數字,它顯示的內容如下:
更新:我們進行了另一次備份並從舊伺服器恢復到新伺服器。這次我們收到了這樣的資訊:
Could not redo log record (2456609:4261461:64), for transaction ID (28:972770238), on page (48:211577379), allocation unit 72058359886184448, database 'DataWarehouse' (database ID 7). Page: LSN = (2456609:3279166:236), allocation unit = 72058351460417536, type = 20. Log: OpCode = 2, context 5, PrevPageLSN: (2456609:4250688:2). Restore from a backup of the database, or repair the database.
我試圖用Google搜尋這條消息,但我能找到的唯一建議是恢復數據庫(這是我想要做的)。
真正的問題是;接下來我該怎麼辦?
真正的問題是;接下來我該怎麼辦?
這可能聽起來不像是 DBA.SE 問題的“答案”,但坦率地說,您在這裡唯一合理的選擇是向 Microsoft 提出支持案例並尋求幫助,或者聘請有信譽的專業人士(顧問或公司)修復 SQL Server 損壞問題的可靠記錄。如果您買不起這些選項,則意味著數據不值得。