Sql-Server

SQL Server: Merge and update multiple rows across columns

  • November 14, 2017

I am trying to merge data from one table into another table.

The data in the destination table is as follows:

name                |dob                        |city   |occupation
-----------------------------------------------------------------------------------
galileo-galilei     |1900-01-01 00:00:00.000    |rome   |polymath
issac-newton        |1900-01-01 00:00:00.000    |london |mathematician-scientist
leonardo-da-cinci   |1900-01-03 00:00:00.000    |rome   |polymath

The data in the source table is:

sl_no   |name               |dob                        |city   |occupation
-----------------------------------------------------------------------------
1       |galileo-galilei    |1900-01-01 00:00:00.000    |       |
2       |galileo-galilei    |1900-01-02 00:00:00.000    |venice |
3       |galileo-galilei    |1900-01-05 00:00:00.000    |       |astronomer

The expected result in the destination table is:

name                |dob                        |city   |occupation
-----------------------------------------------------------------------------------
galileo-galilei     |1900-01-05 00:00:00.000    |venice |astronomer
issac-newton        |1900-01-01 00:00:00.000    |london |mathematician-scientist
leonardo-da-cinci   |1900-01-03 00:00:00.000    |rome   |polymath

My attempts using update-with-join and merge were unsuccessful.

Update with join:

-- updates data from the first match only
update p
set p.city = s.city,
p.occupation = s.occupation
from person_update_with_join_test_primary p, person_update_with_join_test_secondary s
where p.name = s.name ;

Merge:

-- https://technet.microsoft.com/en-us/library/bb522522(v=sql.105).aspx
/*
The MERGE statement attempted to UPDATE or DELETE the same row more than once. 
This happens when a target row matches more than one source row. 
A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. 
Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.
*/
begin
merge person_update_with_join_test_primary as p
using person_update_with_join_test_secondary as s
on (p.name = s.name)
when not matched by target 
then insert (name, dob, city, occupation) 
values (s.name, s.dob, s.city, s.occupation)
when matched 
then update set p.dob = s.dob 
, p.city=(case when (len(s.city)>0) then  s.city else p.city end)
, p.occupation=(case when (len(s.occupation)>0) then  s.occupation else p.occupation end)
output $action, inserted.*, deleted.*;
end

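I believe what I am looking for is similar to what was posted here and here. However, it is not exactly what I want.

Is there any other way to achieve this, apart from using a cursor and an upsert (assuming that works)?

Update #1:

Basically, the latest value in the source (the one with the highest id value) is expected to be merged into the destination, as long as that value is not empty.

For example: for row #3 in source, the city column is not considered for merging into the destination. Similarly, for row #2, the occupation column is not considered for merging into the destination. The name column is the primary key of the destination table.

I am trying to reach the same state in the destination table that I would get by iterating over the source data and updating only the non-empty values in destination, but with a query rather than through the application.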

If only one row per name has the city and occupation columns populated, you can achieve this with window functions.

For example:

DECLARE @Source TABLE(
   sl_no       INT
   ,name       NVARCHAR(30)
   ,dob        DATETIME2(3)
   ,city       NVARCHAR(30)
   ,occupation NVARCHAR(30)
);

INSERT INTO @Source
VALUES
   (1, 'galileo-galilei', '1900-01-01 00:00:00.000', NULL, NULL),
   (2, 'galileo-galilei', '1900-01-02 00:00:00.000', 'venice', NULL),
   (3, 'galileo-galilei', '1900-01-05 00:00:00.000', NULL, 'astronomer'),
   (4, 'issac-newton',    '1900-01-01 00:00:00.000', 'london', 'mathematician-scientist')

SELECT DISTINCT
   name
   ,MAX(dob)           OVER(PARTITION BY name) AS dob
   ,MAX(city)          OVER(PARTITION BY name) AS city
   ,MAX(occupation)    OVER(PARTITION BY name) AS occupation
FROM 
   @Source
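The same collapse can also be written as a plain GROUP BY, since MAX ignores NULLs either way. The sketch below additionally assumes that blanks might arrive as empty strings rather than NULL (the question's `len(...) > 0` checks hint at this) and uses NULLIF to normalise them; that assumption and the sample rows are illustrative, not confirmed by the original post.

```sql
-- Stand-in for the real source table; blanks are stored as '' here (assumption).
DECLARE @Source TABLE(
   sl_no       INT
   ,name       NVARCHAR(30)
   ,dob        DATETIME2(3)
   ,city       NVARCHAR(30)
   ,occupation NVARCHAR(30)
);

INSERT INTO @Source
VALUES
   (1, 'galileo-galilei', '1900-01-01 00:00:00.000', '', ''),
   (2, 'galileo-galilei', '1900-01-02 00:00:00.000', 'venice', ''),
   (3, 'galileo-galilei', '1900-01-05 00:00:00.000', '', 'astronomer');

-- NULLIF turns '' into NULL so that MAX skips it.
SELECT
   name
   ,MAX(dob)                    AS dob
   ,MAX(NULLIF(city, ''))       AS city
   ,MAX(NULLIF(occupation, '')) AS occupation
FROM
   @Source
GROUP BY
   name;
```

Like the window-function version, this only gives the intended answer when each column has at most one populated value per name; with several candidates, MAX returns the alphabetically greatest value, not the most recent one.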

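However, I suspect that in reality you may have multiple records, and that you always want the values from the most recent record that actually contains data in those columns. For example, if your source were: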

DECLARE @Source TABLE(
   sl_no       INT
   ,name       NVARCHAR(30)
   ,dob        DATETIME2(3)
   ,city       NVARCHAR(30)
   ,occupation NVARCHAR(30)
);

INSERT INTO @Source
VALUES
   (1, 'galileo-galilei', '1900-01-01 00:00:00.000', 'rome', NULL),
   (2, 'galileo-galilei', '1900-01-02 00:00:00.000', 'venice', NULL),
   (3, 'galileo-galilei', '1900-01-05 00:00:00.000', NULL, 'astronomer'),
   (4, 'issac-newton',    '1900-01-01 00:00:00.000', 'london', 'mathematician-scientist')

You can achieve what you want with the following:

SELECT
   s.name
   ,s.dob
   ,sc.city
   ,so.occupation
FROM
   @Source AS s
   CROSS APPLY(
       SELECT TOP 1 city
       FROM @Source AS s2
       WHERE s2.name = s.name
       AND city IS NOT NULL
       ORDER BY sl_no DESC
       ) AS sc
   CROSS APPLY(
       SELECT TOP 1 occupation
       FROM @Source AS s3
       WHERE s3.name = s.name
       AND occupation IS NOT NULL
       ORDER BY sl_no DESC
   ) AS so
WHERE
   s.sl_no = (SELECT MAX(sl_no) FROM @Source AS s4 WHERE s4.name = s.name)
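One caveat about CROSS APPLY: it behaves like an inner join, so a name that has no non-NULL city (or no non-NULL occupation) anywhere in the source drops out of the result entirely. A hedged variant, assuming you would rather keep such names and get NULL for the missing column, swaps in OUTER APPLY. The fourth sample row is hypothetical and only there to show the difference.

```sql
DECLARE @Source TABLE(
   sl_no       INT
   ,name       NVARCHAR(30)
   ,dob        DATETIME2(3)
   ,city       NVARCHAR(30)
   ,occupation NVARCHAR(30)
);

INSERT INTO @Source
VALUES
   (1, 'galileo-galilei',   '1900-01-01 00:00:00.000', 'rome', NULL),
   (2, 'galileo-galilei',   '1900-01-02 00:00:00.000', 'venice', NULL),
   (3, 'galileo-galilei',   '1900-01-05 00:00:00.000', NULL, 'astronomer'),
   -- hypothetical row: this name never has a city in the source
   (4, 'leonardo-da-cinci', '1900-01-03 00:00:00.000', NULL, 'polymath');

SELECT
   s.name
   ,s.dob
   ,sc.city          -- NULL when the name has no populated city at all
   ,so.occupation    -- NULL when the name has no populated occupation at all
FROM
   @Source AS s
   OUTER APPLY(
       SELECT TOP (1) s2.city
       FROM @Source AS s2
       WHERE s2.name = s.name
       AND s2.city IS NOT NULL
       ORDER BY s2.sl_no DESC
   ) AS sc
   OUTER APPLY(
       SELECT TOP (1) s3.occupation
       FROM @Source AS s3
       WHERE s3.name = s.name
       AND s3.occupation IS NOT NULL
       ORDER BY s3.sl_no DESC
   ) AS so
WHERE
   s.sl_no = (SELECT MAX(s4.sl_no) FROM @Source AS s4 WHERE s4.name = s.name);
```

If this result feeds an update, wrapping the assignment as `COALESCE(src.city, tgt.city)` keeps the destination's existing value when the source has nothing to offer.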

Wrapping that into a merge or an update (I'll do the merge for you), you get:

WITH src AS (
   SELECT
       s.name
       ,s.dob
       ,sc.city
       ,so.occupation
   FROM
       @Source AS s
       CROSS APPLY(
           SELECT TOP 1 city
           FROM @Source AS s2
           WHERE s2.name = s.name
           AND city IS NOT NULL
           ORDER BY sl_no DESC
           ) AS sc
       CROSS APPLY(
           SELECT TOP 1 occupation
           FROM @Source AS s3
           WHERE s3.name = s.name
           AND occupation IS NOT NULL
           ORDER BY sl_no DESC
       ) AS so
   WHERE
       s.sl_no = (SELECT MAX(sl_no) FROM @Source AS s4 WHERE s4.name = s.name)
)
MERGE INTO Destination AS tgt
USING src ON tgt.name = src.name

WHEN MATCHED THEN UPDATE
SET dob = src.dob
   ,city = src.city
   ,occupation = src.occupation

WHEN NOT MATCHED THEN INSERT(name, dob, city, occupation)
VALUES(src.name, src.dob, src.city, src.occupation);
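For the UPDATE alternative mentioned above, one possible shape is sketched here. It assumes table variables `@Destination` and `@Source` as stand-ins for the real tables (both names are illustrative), uses the sample data from the question with blanks stored as NULL, and keeps the destination's current value whenever the source has no populated value for a column.

```sql
DECLARE @Destination TABLE(
   name        NVARCHAR(30) PRIMARY KEY
   ,dob        DATETIME2(3)
   ,city       NVARCHAR(30)
   ,occupation NVARCHAR(30)
);

DECLARE @Source TABLE(
   sl_no       INT
   ,name       NVARCHAR(30)
   ,dob        DATETIME2(3)
   ,city       NVARCHAR(30)
   ,occupation NVARCHAR(30)
);

INSERT INTO @Destination
VALUES
   ('galileo-galilei',   '1900-01-01 00:00:00.000', 'rome',   'polymath'),
   ('issac-newton',      '1900-01-01 00:00:00.000', 'london', 'mathematician-scientist'),
   ('leonardo-da-cinci', '1900-01-03 00:00:00.000', 'rome',   'polymath');

INSERT INTO @Source
VALUES
   (1, 'galileo-galilei', '1900-01-01 00:00:00.000', NULL, NULL),
   (2, 'galileo-galilei', '1900-01-02 00:00:00.000', 'venice', NULL),
   (3, 'galileo-galilei', '1900-01-05 00:00:00.000', NULL, 'astronomer');

UPDATE tgt
SET dob         = latest.dob
   ,city        = COALESCE(sc.city, tgt.city)              -- keep old value if nothing newer
   ,occupation  = COALESCE(so.occupation, tgt.occupation)
FROM
   @Destination AS tgt
   -- CROSS APPLY: only destination rows with at least one source row are updated
   CROSS APPLY(
       SELECT TOP (1) s.dob
       FROM @Source AS s
       WHERE s.name = tgt.name
       ORDER BY s.sl_no DESC
   ) AS latest
   -- OUTER APPLY: the latest populated value per column, or NULL if there is none
   OUTER APPLY(
       SELECT TOP (1) s.city
       FROM @Source AS s
       WHERE s.name = tgt.name
       AND s.city IS NOT NULL
       ORDER BY s.sl_no DESC
   ) AS sc
   OUTER APPLY(
       SELECT TOP (1) s.occupation
       FROM @Source AS s
       WHERE s.name = tgt.name
       AND s.occupation IS NOT NULL
       ORDER BY s.sl_no DESC
   ) AS so;

SELECT name, dob, city, occupation FROM @Destination ORDER BY name;
```

Unlike the MERGE, this form never inserts new names; it only touches destination rows that already exist.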

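You will want to index the column you join on (name in the example above) to improve performance. Otherwise you will get a lot of scans.

Source: https://dba.stackexchange.com/questions/190819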
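As a rough illustration of that advice, an index on the source table might look like the following. The table name comes from the question; the index name, the key order, and the included columns are assumptions, and the statement presumes the table already exists with the columns shown earlier.

```sql
-- Hypothetical index: name supports the join/correlation,
-- sl_no DESC supports the TOP (1) ... ORDER BY sl_no DESC lookups.
CREATE NONCLUSTERED INDEX IX_secondary_name_sl_no
    ON person_update_with_join_test_secondary (name, sl_no DESC)
    INCLUDE (dob, city, occupation);
```

Whether the INCLUDE columns pay off depends on table width and write volume, so it is worth checking the actual execution plan before and after.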