“Upsert”觸發從 SSIS 寫入大量重複項
我有一個“upsert”觸發器的問題,它是由一個帶有快速載入的 SSIS 包觸發的。(啟用 FIRE_TRIGGERS)
CREATE TRIGGER [dbo].[MP_VOL_UPSERT] ON [dbo].[vwMP_VOL] INSTEAD OF INSERT AS BEGIN SET NOCOUNT ON; UPDATE [dbo].[vwMP_VOL] SET [vwMP_VOL].[id_batch] = inserted.id_batch ,[vwMP_VOL].[id_invoice] = inserted.id_invoice ,[vwMP_VOL].[text_vchr] = inserted.text_vchr ,[vwMP_VOL].[amt_frt_tcur] = inserted.amt_frt_tcur ,[vwMP_VOL].[amt_line_hcur] = inserted.amt_line_hcur ,[vwMP_VOL].[line_total_hcur] = inserted.line_total_hcur ,[vwMP_VOL].[id_tax_1] = inserted.id_tax_1 ,[vwMP_VOL].[id_tax_2] = inserted.id_tax_2 ,[vwMP_VOL].[id_tax_3] = inserted.id_tax_3 ,[vwMP_VOL].[id_tax_4] = inserted.id_tax_4 ,[vwMP_VOL].[line_total_rem_hcu] = inserted.line_total_rem_hcu ,[vwMP_VOL].[id_po] = inserted.id_po ,[vwMP_VOL].[cnt_rlse] = inserted.cnt_rlse ,[vwMP_VOL].[cnt_line_nbr] = inserted.cnt_line_nbr FROM inserted INNER JOIN (SELECT [Org] ,[id_vend] ,[id_org_orig] ,[id_vchr_nbr] ,[id_vchr_line] FROM [vwMP_VOL]) AS old ON old.[Org] = inserted.Org AND old.[id_vend] = inserted.id_vend AND old.[id_org_orig] = inserted.id_org_orig AND old.[id_vchr_nbr] = inserted.id_vchr_nbr AND old.[id_vchr_line] = inserted.id_vchr_line; INSERT INTO [dbo].[vwMP_VOL] SELECT inserted.* FROM inserted LEFT JOIN (SELECT [Org] ,[id_vend] ,[id_org_orig] ,[id_vchr_nbr] ,[id_vchr_line] FROM [vwMP_VOL]) AS old ON old.[Org] = inserted.Org AND old.[id_vend] = inserted.id_vend AND old.[id_org_orig] = inserted.id_org_orig AND old.[id_vchr_nbr] = inserted.id_vchr_nbr AND old.[id_vchr_line] = inserted.id_vchr_line WHERE old.Org IS NULL; END
插入新行可以正常工作,但所有現有行都會使用一組值進行更新。
SELECT text_vchr, count(*) as dups FROM MP_VOL group by text_vchr order by count(*) desc Results: text_vchr dups ------------------------- ----- MEDIA MONITORING SERVICES 20897 1
我是否必須在 SSIS 中禁用快速載入,還是有其他方法可以修復它?
(它是 SQL Server 2005,所以
MERGE
不可用)
我懷疑問題出現是因為您在語句中引用了兩次視圖
UPDATE
,一次是在UPDATE [dbo].[vwMP_VOL]
子句中,然後是在FROM
子句中,第二次是給它一個別名。但是您沒有在SET
子句中使用別名,僅在第一個引用中使用。結果是這兩個引用是不相關的,並且行用不相關的數據更新(可能還有很多次!)
請參閱官方 MSDN 文件
UPDATE
和注意事項:最佳實踐
…
指定 FROM 子句以提供更新操作的條件時要小心。如果 UPDATE 語句包含未指定的 FROM 子句,則該語句的結果是未定義的,即,如果 UPDATE 語句不是確定性的,則每個更新的列出現只有一個值可用。
並在 Hugo Kornelis 的部落格文章中更詳細地分析了這個“特性”和它可能導致的其他問題:**讓我們棄用
UPDATE FROM
!**尤其是這一段:正確性?呸,誰在乎?
嗯,大多數都這樣做。這就是我們測試的原因。
如果我在查詢中弄亂了連接條件
SELECT
,以至於第二個表中有太多行匹配,我會在測試後立即看到它,因為我得到的行數比預期的要多。如果我以類似的方式在 ANSI 標準查詢中弄亂子查詢條件UPDATE
,我會更快看到它,因為如果子查詢返回多個值,SQL Server 將返回錯誤。但是使用專有UPDATE FROM
語法, 我可以搞砸連接並且永遠不會注意到- SQL Server會愉快地一遍又一遍地更新同一行如果它與連接表中的多行匹配,則僅保留最後一個更新的結果。並且無法知道將是哪一行,因為這取決於恰好選擇的查詢執行計劃。最壞的情況是執行計劃恰好在單處理器開發伺服器上的所有測試中產生預期的結果——然後,在部署到四路雙核生產伺服器後,我們的寶貴數據突然擊中風扇…我認為解決此問題的最佳方法是在
UPDATE
子句中使用別名並相應地更改SET
子句:UPDATE old SET [id_batch] = inserted.id_batch ,[id_invoice] = inserted.id_invoice ,[text_vchr] = inserted.text_vchr ,[amt_frt_tcur] = inserted.amt_frt_tcur ,[amt_line_hcur] = inserted.amt_line_hcur ,[line_total_hcur] = inserted.line_total_hcur ,[id_tax_1] = inserted.id_tax_1 ,[id_tax_2] = inserted.id_tax_2 ,[id_tax_3] = inserted.id_tax_3 ,[id_tax_4] = inserted.id_tax_4 ,[line_total_rem_hcu] = inserted.line_total_rem_hcu ,[id_po] = inserted.id_po ,[cnt_rlse] = inserted.cnt_rlse ,[cnt_line_nbr] = inserted.cnt_line_nbr FROM inserted -- INNER JOIN (SELECT * FROM [dbo].[vwMP_VOL]) AS old -- or the simpler INNER JOIN [dbo].[vwMP_VOL] AS old ON old.[Org] = inserted.Org AND old.[id_vend] = inserted.id_vend AND old.[id_org_orig] = inserted.id_org_orig AND old.[id_vchr_nbr] = inserted.id_vchr_nbr AND old.[id_vchr_line] = inserted.id_vchr_line;
這也可以,但我不像以前那樣喜歡它,因為我們不能給表起別名:
UPDATE [dbo].[vwMP_VOL] SET [id_batch] = inserted.id_batch --- ,[cnt_line_nbr] = inserted.cnt_line_nbr FROM inserted WHERE [dbo].[vwMP_VOL].[Org] = inserted.Org AND [dbo].[vwMP_VOL].[id_vend] = inserted.id_vend AND [dbo].[vwMP_VOL].[id_org_orig] = inserted.id_org_orig AND [dbo].[vwMP_VOL].[id_vchr_nbr] = inserted.id_vchr_nbr AND [dbo].[vwMP_VOL].[id_vchr_line] = inserted.id_vchr_line;