Sql-Server

“Upsert”觸發從 SSIS 寫入大量重複項

  • June 8, 2016

我有一個“upsert”觸發器的問題,它是由一個帶有快速載入的 SSIS 包觸發的。(啟用 FIRE_TRIGGERS)

CREATE TRIGGER [dbo].[MP_VOL_UPSERT] 
  ON  [dbo].[vwMP_VOL] 
  INSTEAD OF INSERT
AS 
BEGIN
   SET NOCOUNT ON;

UPDATE [dbo].[vwMP_VOL]
  SET
   [vwMP_VOL].[id_batch] = inserted.id_batch
     ,[vwMP_VOL].[id_invoice] = inserted.id_invoice
     ,[vwMP_VOL].[text_vchr] = inserted.text_vchr
     ,[vwMP_VOL].[amt_frt_tcur] = inserted.amt_frt_tcur
     ,[vwMP_VOL].[amt_line_hcur] = inserted.amt_line_hcur
     ,[vwMP_VOL].[line_total_hcur] = inserted.line_total_hcur
     ,[vwMP_VOL].[id_tax_1] = inserted.id_tax_1
     ,[vwMP_VOL].[id_tax_2] = inserted.id_tax_2
     ,[vwMP_VOL].[id_tax_3] = inserted.id_tax_3
     ,[vwMP_VOL].[id_tax_4] = inserted.id_tax_4
     ,[vwMP_VOL].[line_total_rem_hcu] = inserted.line_total_rem_hcu
     ,[vwMP_VOL].[id_po] = inserted.id_po
     ,[vwMP_VOL].[cnt_rlse] = inserted.cnt_rlse
     ,[vwMP_VOL].[cnt_line_nbr] = inserted.cnt_line_nbr
FROM inserted 
INNER JOIN (SELECT [Org]
     ,[id_vend]
     ,[id_org_orig]
     ,[id_vchr_nbr]
     ,[id_vchr_line]
 FROM [vwMP_VOL]) AS old
ON old.[Org] = inserted.Org
     AND old.[id_vend] = inserted.id_vend
     AND old.[id_org_orig] = inserted.id_org_orig
     AND old.[id_vchr_nbr] = inserted.id_vchr_nbr
     AND old.[id_vchr_line] = inserted.id_vchr_line;

INSERT INTO [dbo].[vwMP_VOL] 
SELECT inserted.* FROM inserted 
LEFT JOIN (SELECT [Org]
     ,[id_vend]
     ,[id_org_orig]
     ,[id_vchr_nbr]
     ,[id_vchr_line]
 FROM [vwMP_VOL]) AS old
ON old.[Org] = inserted.Org
     AND old.[id_vend] = inserted.id_vend
     AND old.[id_org_orig] = inserted.id_org_orig
     AND old.[id_vchr_nbr] = inserted.id_vchr_nbr
     AND old.[id_vchr_line] = inserted.id_vchr_line
WHERE old.Org IS NULL;

END

插入新行可以正常工作,但所有現有行都會使用一組值進行更新。

SELECT     
text_vchr, count(*) as dups
FROM MP_VOL
group by text_vchr
order by count(*) desc

Results:
text_vchr                   dups
-------------------------   -----
MEDIA MONITORING SERVICES   20897
                               1

我是否必須在 SSIS 中禁用快速載入,還是有其他方法可以修復它?

(它是 SQL Server 2005,所以MERGE不可用)

我懷疑問題出現是因為您在語句中引用了兩次視圖UPDATE,一次是在UPDATE [dbo].[vwMP_VOL]子句中,然後是在FROM子句中,第二次是給它一個別名。但是您沒有在SET子句中使用別名,僅在第一個引用中使用。

結果是這兩個引用是不相關的,並且行用不相關的數據更新(可能還有很多次!)

請參閱官方 MSDN 文件UPDATE和注意事項:

最佳實踐

指定 FROM 子句以提供更新操作的條件時要小心。如果 UPDATE 語句包含未指定的 FROM 子句,則該語句的結果是未定義的,即,如果 UPDATE 語句不是確定性的,則每個更新的列出現只有一個值可用。

並在 Hugo Kornelis 的部落格文章中更詳細地分析了這個“特性”和它可能導致的其他問題:**讓我們棄用UPDATE FROM**尤其是這一段:

正確性?呸,誰在乎?

嗯,大多數都這樣做。這就是我們測試的原因。

如果我在查詢中弄亂了連接條件SELECT,以至於第二個表中有太多行匹配,我會在測試後立即看到它,因為我得到的行數比預期的要多。如果我以類似的方式在 ANSI 標準查詢中弄亂子查詢條件UPDATE,我會更快看到它,因為如果子查詢返回多個值,SQL Server 將返回錯誤。但是使用專有UPDATE FROM語法, 我可以搞砸連接並且永遠不會注意到- SQL Server會愉快地一遍又一遍地更新同一行如果它與連接表中的多行匹配,則僅保留最後一個更新的結果。並且無法知道將是哪一行,因為這取決於恰好選擇的查詢執行計劃。最壞的情況是執行計劃恰好在單處理器開發伺服器上的所有測試中產生預期的結果——然後,在部署到四路雙核生產伺服器後,我們的寶貴數據突然擊中風扇…


我認為解決此問題的最佳方法是在UPDATE子句中使用別名並相應地更改SET子句:

UPDATE old 
  SET
      [id_batch] = inserted.id_batch
     ,[id_invoice] = inserted.id_invoice
     ,[text_vchr] = inserted.text_vchr
     ,[amt_frt_tcur] = inserted.amt_frt_tcur
     ,[amt_line_hcur] = inserted.amt_line_hcur
     ,[line_total_hcur] = inserted.line_total_hcur
     ,[id_tax_1] = inserted.id_tax_1
     ,[id_tax_2] = inserted.id_tax_2
     ,[id_tax_3] = inserted.id_tax_3
     ,[id_tax_4] = inserted.id_tax_4
     ,[line_total_rem_hcu] = inserted.line_total_rem_hcu
     ,[id_po] = inserted.id_po
     ,[cnt_rlse] = inserted.cnt_rlse
     ,[cnt_line_nbr] = inserted.cnt_line_nbr
FROM inserted 
         --   INNER JOIN (SELECT * FROM [dbo].[vwMP_VOL]) AS old
-- or the simpler
INNER JOIN [dbo].[vwMP_VOL] AS old
ON old.[Org] = inserted.Org
     AND old.[id_vend] = inserted.id_vend
     AND old.[id_org_orig] = inserted.id_org_orig
     AND old.[id_vchr_nbr] = inserted.id_vchr_nbr
     AND old.[id_vchr_line] = inserted.id_vchr_line;

這也可以,但我不像以前那樣喜歡它,因為我們不能給表起別名:

UPDATE [dbo].[vwMP_VOL] 
  SET
      [id_batch] = inserted.id_batch
     ---
     ,[cnt_line_nbr] = inserted.cnt_line_nbr
FROM inserted 
WHERE [dbo].[vwMP_VOL].[Org] = inserted.Org
     AND [dbo].[vwMP_VOL].[id_vend] = inserted.id_vend
     AND [dbo].[vwMP_VOL].[id_org_orig] = inserted.id_org_orig
     AND [dbo].[vwMP_VOL].[id_vchr_nbr] = inserted.id_vchr_nbr
     AND [dbo].[vwMP_VOL].[id_vchr_line] = inserted.id_vchr_line;

引用自:https://dba.stackexchange.com/questions/140692