無法在非唯一索引上插入重複的鍵行?
在過去的幾天裡,在 8 週沒有錯誤之後,我們遇到了 3 次這個奇怪的錯誤,我很難過。
這是錯誤消息:
Executing the query "EXEC dbo.MergeTransactions" failed with the following error: "Cannot insert duplicate key row in object 'sales.Transactions' with unique index 'NCI_Transactions_ClientID_TransactionDate'. The duplicate key value is (1001, 2018-12-14 19:16:29.00, 304050920).".
我們擁有的索引不是唯一的。如果您注意到,錯誤消息中的重複鍵值甚至沒有與索引對齊。奇怪的是,如果我重新執行 proc,它會成功。
這是我能找到的最新連結,它有我的問題,但我沒有看到解決方案。
關於我的場景的幾件事:
- proc 正在更新 TransactionID(主鍵的一部分) - 我認為這是導致錯誤的原因,但不知道為什麼?我們將刪除該邏輯。
- 在表上啟用更改跟踪
- 做事務讀取未送出
每個表有45個欄位,我主要列出了索引中使用的欄位。我正在更新語句中的 TransactionID(聚集鍵)(不必要)。奇怪的是,直到上週我們幾個月都沒有遇到任何問題。它只是通過 SSIS 偶爾發生。
桌子
USE [DB] GO /****** Object: Table [sales].[Transactions] Script Date: 5/29/2019 1:37:49 PM ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[Transactions]') AND type in (N'U')) BEGIN CREATE TABLE [sales].[Transactions] ( [TransactionID] [bigint] NOT NULL, [ClientID] [int] NOT NULL, [TransactionDate] [datetime2](2) NOT NULL, /* snip*/ [BusinessUserID] [varchar](150) NOT NULL, [BusinessTransactionID] [varchar](150) NOT NULL, [InsertDate] [datetime2](2) NOT NULL, [UpdateDate] [datetime2](2) NOT NULL, CONSTRAINT [PK_Transactions_TransactionID] PRIMARY KEY CLUSTERED ( [TransactionID] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION=PAGE) ON [DB_Data] ) ON [DB_Data] END GO USE [DB] IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[sales].[Transactions]') AND name = N'NCI_Transactions_ClientID_TransactionDate') begin CREATE NONCLUSTERED INDEX [NCI_Transactions_ClientID_TransactionDate] ON [sales].[Transactions] ( [ClientID] ASC, [TransactionDate] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION = PAGE) ON [DB_Data] END IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[DF_Transactions_Units]') AND type = 'D') BEGIN ALTER TABLE [sales].[Transactions] ADD CONSTRAINT [DF_Transactions_Units] DEFAULT ((0)) FOR [Units] END GO IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[DF_Transactions_ISOCurrencyCode]') AND type = 'D') BEGIN ALTER TABLE [sales].[Transactions] ADD CONSTRAINT [DF_Transactions_ISOCurrencyCode] DEFAULT ('USD') FOR [ISOCurrencyCode] END GO IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[DF_Transactions_InsertDate]') AND type = 'D') BEGIN ALTER TABLE [sales].[Transactions] ADD CONSTRAINT [DF_Transactions_InsertDate] DEFAULT (sysdatetime()) FOR [InsertDate] END GO IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[DF_Transactions_UpdateDate]') AND type = 'D') BEGIN ALTER TABLE [sales].[Transactions] ADD CONSTRAINT [DF_Transactions_UpdateDate] DEFAULT (sysdatetime()) FOR [UpdateDate] END GO
臨時表
same columns as the mgdata. including the relevant fields. Also has a non-unique clustered index ( [BusinessTransactionID] [varchar](150) NULL, [BusinessUserID] [varchar](150) NULL, [PostalCode] [varchar](25) NULL, [TransactionDate] [datetime2](2) NULL, [Units] [int] NOT NULL, [StartDate] [datetime2](2) NULL, [EndDate] [datetime2](2) NULL, [TransactionID] [bigint] NULL, [ClientID] [int] NULL, ) CREATE CLUSTERED INDEX ##workingTransactionsMG_idx ON #workingTransactions (TransactionID) It is populated in batches (500k rows at a time), something like this IF OBJECT_ID(N'tempdb.dbo.#workingTransactions') IS NOT NULL DROP TABLE #workingTransactions; select fields into #workingTransactions from import.Transactions where importrowid between two number ranges -- pseudocode
首要的關鍵
CONSTRAINT [PK_Transactions_TransactionID] PRIMARY KEY CLUSTERED ( [TransactionID] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION=PAGE) ON [Data] ) ON [Data]
非聚集索引
CREATE NONCLUSTERED INDEX [NCI_Transactions_ClientID_TransactionDate] ON [sales].[Transactions] ( [ClientID] ASC, [TransactionDate] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION = PAGE)
樣本更新聲明
-- updates every field update t set t.transactionid = s.transactionid, t.[CityCode]=s.[CityCode], t.TransactionDate=s.[TransactionDate], t.[ClientID]=s.[ClientID], t.[PackageMonths] = s.[PackageMonths], t.UpdateDate = @UpdateDate FROM #workingTransactions s JOIN [DB].[sales].[Transactions] t ON s.[TransactionID] = t.[TransactionID] WHERE CAST(HASHBYTES('SHA2_256 ',CONCAT( S.[BusinessTransactionID],'|',S.[BusinessUserID],'|', etc) <> CAST(HASHBYTES('SHA2_256 ',CONCAT( T.[BusinessTransactionID],'|',T.[BusinessUserID],'|', etc)
我的問題是,引擎蓋下發生了什麼?解決方案是什麼?作為參考,上面的連結提到了這一點:
在這一點上,我有幾個理論:
- 與記憶體壓力或大型並行更新計劃相關的錯誤,但我預計會出現不同類型的錯誤,到目前為止,我無法將低資源與這些孤立和零星錯誤的時間範圍相關聯。
- UPDATE 語句或數據中的錯誤導致主鍵上的實際重複違規,但一些模糊的 SQL Server 錯誤導致錯誤消息引用錯誤的索引名稱。
- 由讀取未送出隔離導致的髒讀取導致大量並行更新以雙重插入。但是 ETL 開發人員聲稱使用了預設的已送出讀,並且很難準確地確定程序在執行時實際使用的隔離級別。
我懷疑如果我調整執行計劃作為解決方法,可能是 MAXDOP (1) 提示或使用會話跟踪標誌來禁用假離線操作,錯誤就會消失,但目前尚不清楚這將如何影響性能
版本
Microsoft SQL Server 2017 (RTM-CU13) (KB4466404) - 14.0.3048.4 (X64) 2018 年 11 月 30 日 12:57:58 版權所有 (C) 2017 Microsoft Corporation Enterprise Edition (64-bit) on Windows Server 2016 Standard 10.0 (Build 14393) :)
我的問題是,引擎蓋下發生了什麼?解決方案是什麼?
這是一個錯誤。問題是它只是偶爾發生,並且很難重現。不過,您最好的機會是獲得 Microsoft 支持。更新處理非常複雜,因此需要進行非常詳細的調查。
有關所涉及的複雜性的範例,請查看我的文章MERGE Bug with Filtered Indexes和Incorrect Results with Indexed Views。這些都與您的問題沒有直接關係,但它們確實給人一種感覺。
編寫確定性更新
當然,這都是相當籠統的。也許更有用的是,我可以說你應該重寫你目前的
UPDATE
陳述。正如文件所說:> > 指定 FROM 子句以提供更新操作的條件時要小心。如果 UPDATE 語句包含未指定的 FROM 子句,則該語句的結果是未定義的,即,如果 UPDATE 語句不是確定性的,則每個更新的列出現只有一個值可用。 > > >
你
UPDATE
不是確定性的,因此結果是不確定的。您應該更改它,以便為每個目標行標識最多一個源行。如果沒有該更改,更新的結果可能不會反映任何單獨的源行。例子
讓我向您展示一個範例,使用根據問題中給出的表格鬆散地建模的表格:
CREATE TABLE dbo.Transactions ( TransactionID bigint NOT NULL, ClientID integer NOT NULL, TransactionDate datetime2(2) NOT NULL, CONSTRAINT PK_dbo_Transactions PRIMARY KEY CLUSTERED (TransactionID), INDEX dbo_Transactions_ClientID_TranDate (ClientID, TransactionDate) ); CREATE TABLE #Working ( TransactionID bigint NULL, ClientID integer NULL, TransactionDate datetime2(2) NULL, INDEX cx CLUSTERED (TransactionID) );
為簡單起見,在目標表中放置一行,在源表中放置四行:
INSERT dbo.Transactions (TransactionID, ClientID, TransactionDate) VALUES (1, 1, '2019-01-01'); INSERT #Working (TransactionID, ClientID, TransactionDate) VALUES (1, 2, NULL), (1, NULL, '2019-03-03'), (1, 3, NULL), (1, NULL, '2019-02-02');
所有四個源行都匹配目標 on
TransactionID
,所以如果我們執行單獨連接的更新(如問題中的那個),將TransactionID
使用哪一個?UPDATE T SET T.TransactionID = W.TransactionID, T.ClientID = W.ClientID, T.TransactionDate = W.TransactionDate FROM #Working AS W JOIN dbo.Transactions AS T ON T.TransactionID = W.TransactionID;
(更新
TransactionID
專欄對於demo來說並不重要,喜歡的可以註釋掉。)第一個驚喜是
UPDATE
完成沒有錯誤,儘管目標表不允許任何列中的空值(所有候選行都包含空值)。重要的一點是結果是undefined,在這種情況下會產生一個與任何源行都不匹配的結果:
SELECT T.TransactionID, T.ClientID, T.TransactionDate FROM dbo.Transactions AS T;
╔═══════════════╦══════════╦════════════════════════╗ ║ TransactionID ║ ClientID ║ TransactionDate ║ ╠═══════════════╬══════════╬════════════════════════╣ ║ 1 ║ 2 ║ 2019-03-03 00:00:00.00 ║ ╚═══════════════╩══════════╩════════════════════════╝
更多詳細資訊:ANY 聚合已損壞
更新應該寫成,如果寫成等效
MERGE
語句,它會成功,它會檢查多次更新同一目標行的嘗試。我一般不建議MERGE
直接使用,因為它已經受到很多實現錯誤的影響,通常性能更差。作為獎勵,您可能會發現將目前更新重寫為確定性會導致偶爾出現的錯誤問題也消失。當然,對於編寫非確定性更新的人來說,產品錯誤仍然存在。