Sorting on multiple columns in a FULL OUTER JOIN where the columns may be NULL
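I'm trying to compare a plan with a history to see whether steps were omitted or unplanned steps were included. Normal, planned history items as well as the exceptions are meant to go into a detailed report. The problem is that each of the joined tables has a column I need to sort on, but because of the `full outer join` I can't guarantee that they are `not null`, which breaks the sorting. I'll continue with a simplified example in which the plan corresponds to a recipe:

```
Table: RecipeSteps

 Id | RecipeId | Position | Message
----+----------+----------+----------------------------
 50 |      123 |        1 | Wash vegetables
 51 |      123 |        2 | Peel vegetables
 52 |      123 |        3 | Cut meat
 53 |      123 |        4 | Turn on stove
 54 |      123 |        5 | Cook and stir it
 55 |      123 |        6 | Turn off stove
----+----------+----------+----------------------------
```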
…and the history of the steps actually taken while cooking:

```
Table: History

 Id | TimeStamp | Session | StepId | Message
----+-----------+---------+--------+-----------------------------
 90 | 00:01:00  |     321 |   NULL | Chef entered kitchen
 91 | 00:02:00  |     321 |     51 | Chef peeled vegetables
 92 | 00:03:00  |     321 |     52 | Chef cut meat
 93 | 00:04:00  |     321 |   NULL | Chef picked his nose
 94 | 00:05:00  |     321 |     53 | Chef turned on stove
 95 | 00:06:00  |     321 |     54 | Chef started cooking
 96 | 00:10:00  |     321 |   NULL | Chef left kitchen
 97 | 01:00:00  |     321 |   NULL | FIRE!
----+-----------+---------+--------+-----------------------------
```
Now I want to compare these tables using a `full outer join`, producing a result table similar to the following:

```
Table: Result

Sorting | TimeStamp | Position | Message                | Comment
--------+-----------+----------+------------------------+---------------
      1 | 00:01:00  |     NULL | Chef entered kitchen   | unplanned
      2 | NULL      |        1 | Wash vegetables        | omitted
      3 | 00:02:00  |        2 | Chef peeled vegetables | planned
      4 | 00:03:00  |        3 | Chef cut meat          | planned
      5 | 00:04:00  |     NULL | Chef picked his nose   | unplanned
      6 | 00:05:00  |        4 | Chef turned on stove   | planned
      7 | 00:06:00  |        5 | Chef started cooking   | planned
      8 | NULL      |        6 | Turn off stove         | omitted
      9 | 00:10:00  |     NULL | Chef left kitchen      | unplanned
     10 | 01:00:00  |     NULL | FIRE!                  | unplanned
--------+-----------+----------+------------------------+---------------
```
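A reader of the report would then see that:
- Entering and leaving the kitchen is not part of the recipe, but it was logged anyway.
- The chef forgot to wash the vegetables (yuck!), or for some reason the step was not logged (boo!).
- The history then follows the plan, except that the chef picked his nose in between (double yuck!).
- At the end the chef forgot to turn off the stove, which caused the fire.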
I came up with the following `SELECT` statement, which I intend to query with `ORDER BY Sorting ASC`, but I still don't know how to fill the `Sorting` column:

```sql
SELECT
    0 Sorting,     -- this has to be replaced
    h.TimeStamp,
    r.Position,
    '...' Message, -- case-when-construct, not important for the question
    '...' Comment  -- case-when-construct, not important for the question
FROM RecipeSteps r
FULL OUTER JOIN History h ON h.StepId = r.Id
WHERE (r.RecipeId = 123 OR r.RecipeId IS NULL)
  AND (h.Session = 321 OR h.Session IS NULL)
```
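There is a natural order in the history, given by the timestamp, and in the recipe, given by the `Position` column. Since the two are never `null` at the same time, it should be possible to build the `Sorting` column from them, but so far I haven't figured out how. I know I could do it procedurally (sort both sequences separately, enumerate them, compare them, and so on), but I'm curious whether this is also possible in non-procedural SQL.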
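Please note: for simplicity, the query below does not take partitioning by `Session` into account. The main idea is to split the data set into two groups and process them separately.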
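`History` goes first. I decided to treat `History` as more important than `Steps`: if some steps were swapped, they will be shown in the order in which they happened.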
So, the first set is all rows from `History`, plus any details from the corresponding step. This is a simple `LEFT JOIN`.

The second set is the omitted steps. This is another `LEFT JOIN`, with a filter.

If you run the first CTE, `CTE_History`, ordered by timestamp, you'll see that unplanned events have a `NULL` `Position`, but thanks to their timestamps they already sit in the correct place in the result set. All we need to do is fill in these gaps.

```
+-------------+---------------------+----------+------------------------+-----------+
| NewPosition |         ts          | Position |       NewMessage       |  Comment  |
+-------------+---------------------+----------+------------------------+-----------+
|           0 | 2017-01-01 00:01:00 | NULL     | Chef entered kitchen   | unplanned |
|           2 | 2017-01-01 00:02:00 | 2        | Chef peeled vegetables | planned   |
|           3 | 2017-01-01 00:03:00 | 3        | Chef cut meat          | planned   |
|           3 | 2017-01-01 00:04:00 | NULL     | Chef picked his nose   | unplanned |
|           4 | 2017-01-01 00:05:00 | 4        | Chef turned on stove   | planned   |
|           5 | 2017-01-01 00:06:00 | 5        | Chef started cooking   | planned   |
|           5 | 2017-01-01 00:10:00 | NULL     | Chef left kitchen      | unplanned |
|           5 | 2017-01-01 01:00:00 | NULL     | FIRE!                  | unplanned |
+-------------+---------------------+----------+------------------------+-----------+
```
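One way to do this is to take the last non-null value from the preceding rows. Unfortunately, SQL Server versions before 2022 do not implement the `IGNORE NULLS` clause in the `LAST_VALUE` function, so we have to use a workaround with `MAX` and rely on the fact that `Position` grows as `ts` grows. In the more general case it would be a bit more complicated, as Itzik Ben-Gan shows in his The Last non NULL Puzzle.

The first `NULL` is changed to `0` (or to some number that is smaller than any possible position). Once every row has a position value, it is simple to union this result with all the omitted steps and sort them together.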
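For readers on a version that does support `IGNORE NULLS` (SQL Server 2022 or Azure SQL Database, to my knowledge), the fill step could probably be written directly with `LAST_VALUE`. This is only a sketch under that assumption; the table variable `@H` is a made-up miniature of the joined history, not part of the original data.

```sql
-- Hypothetical miniature of the joined history:
-- Position is NULL for unplanned rows.
DECLARE @H TABLE (ts datetime2(0), Position int);

INSERT INTO @H (ts, Position) VALUES
    ('2017-01-01 00:01:00', NULL),
    ('2017-01-01 00:02:00', 2),
    ('2017-01-01 00:03:00', 3),
    ('2017-01-01 00:04:00', NULL);

SELECT
    ts
    ,Position
    -- Carry the last non-NULL Position forward;
    -- fall back to 0 for rows before the first planned step.
    ,ISNULL(
        LAST_VALUE(Position) IGNORE NULLS OVER
            (ORDER BY ts
             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
        , 0) AS NewPosition
FROM @H
ORDER BY ts;
```

Unlike the `MAX` workaround, this form does not depend on `Position` growing together with `ts`, because it picks the most recent non-NULL value rather than the largest one.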
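With this approach, in ambiguous cases all omitted steps appear after the unplanned steps, so `Turn off stove` comes last. If you choose to calculate `MIN` over the following rows instead of `MAX` over the preceding rows, this rule is reversed.

Sample data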
```sql
-- Plan: the steps of recipe 123 in their intended order.
DECLARE @RecipeSteps TABLE (Id int, RecipeId int, Position int, Message nvarchar(255));

INSERT INTO @RecipeSteps (Id, RecipeId, Position, Message) VALUES
(50, 123, 1, 'Wash vegetables'),
(51, 123, 2, 'Peel vegetables'),
(52, 123, 3, 'Cut meat'),
(53, 123, 4, 'Turn on stove'),
(54, 123, 5, 'Cook and stir it'),
(55, 123, 6, 'Turn off stove');

-- History: what was actually logged; StepId is NULL for unplanned events.
DECLARE @History TABLE (Id int, ts datetime2(0), Session int, StepId int, Message nvarchar(255));

INSERT INTO @History (Id, ts, Session, StepId, Message) VALUES
(90, '2017-01-01 00:01:00', 321, NULL, 'Chef entered kitchen'),
(91, '2017-01-01 00:02:00', 321, 51,   'Chef peeled vegetables'),
(92, '2017-01-01 00:03:00', 321, 52,   'Chef cut meat'),
(93, '2017-01-01 00:04:00', 321, NULL, 'Chef picked his nose'),
(94, '2017-01-01 00:05:00', 321, 53,   'Chef turned on stove'),
(95, '2017-01-01 00:06:00', 321, 54,   'Chef started cooking'),
(96, '2017-01-01 00:10:00', 321, NULL, 'Chef left kitchen'),
(97, '2017-01-01 01:00:00', 321, NULL, 'FIRE!');
```
Query

```sql
WITH
CTE_History
AS
(
    -- All history rows, with step details where the event was planned.
    SELECT
        -- Running MAX fills NULL positions with the last known position;
        -- rows before the first planned step get 0.
        ISNULL(
            MAX(S.Position) OVER (ORDER BY H.ts
                ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
            , 0) AS NewPosition
        ,H.ts
        ,S.Position
        ,H.Message AS NewMessage
        ,CASE WHEN S.Id IS NULL THEN 'unplanned' ELSE 'planned' END AS Comment
    FROM
        @History AS H
        LEFT JOIN @RecipeSteps AS S ON S.Id = H.StepId
)
,CTE_OmittedSteps
AS
(
    -- Recipe steps that have no matching history row.
    SELECT
        S.Position AS NewPosition
        ,H.ts
        ,S.Position
        ,S.Message AS NewMessage
        ,'omitted' AS Comment
    FROM
        @RecipeSteps AS S
        LEFT JOIN @History AS H ON S.Id = H.StepId
    WHERE
        H.Id IS NULL
)
SELECT * FROM CTE_History
UNION ALL
SELECT * FROM CTE_OmittedSteps
ORDER BY NewPosition, ts;
```
Result

```
+-------------+---------------------+----------+------------------------+-----------+
| NewPosition |         ts          | Position |       NewMessage       |  Comment  |
+-------------+---------------------+----------+------------------------+-----------+
|           0 | 2017-01-01 00:01:00 | NULL     | Chef entered kitchen   | unplanned |
|           1 | NULL                | 1        | Wash vegetables        | omitted   |
|           2 | 2017-01-01 00:02:00 | 2        | Chef peeled vegetables | planned   |
|           3 | 2017-01-01 00:03:00 | 3        | Chef cut meat          | planned   |
|           3 | 2017-01-01 00:04:00 | NULL     | Chef picked his nose   | unplanned |
|           4 | 2017-01-01 00:05:00 | 4        | Chef turned on stove   | planned   |
|           5 | 2017-01-01 00:06:00 | 5        | Chef started cooking   | planned   |
|           5 | 2017-01-01 00:10:00 | NULL     | Chef left kitchen      | unplanned |
|           5 | 2017-01-01 01:00:00 | NULL     | FIRE!                  | unplanned |
|           6 | NULL                | 6        | Turn off stove         | omitted   |
+-------------+---------------------+----------+------------------------+-----------+
```
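The `MIN`-over-following-rows alternative mentioned earlier might look roughly like the sketch below. The fallback value `2147483647` is my own assumption (any number larger than every possible position would do); it is needed because unplanned rows at the very end have no following planned step. The sample data is repeated so the batch runs on its own.

```sql
DECLARE @RecipeSteps TABLE (Id int, RecipeId int, Position int, Message nvarchar(255));
INSERT INTO @RecipeSteps (Id, RecipeId, Position, Message) VALUES
(50, 123, 1, 'Wash vegetables'),
(51, 123, 2, 'Peel vegetables'),
(52, 123, 3, 'Cut meat'),
(53, 123, 4, 'Turn on stove'),
(54, 123, 5, 'Cook and stir it'),
(55, 123, 6, 'Turn off stove');

DECLARE @History TABLE (Id int, ts datetime2(0), Session int, StepId int, Message nvarchar(255));
INSERT INTO @History (Id, ts, Session, StepId, Message) VALUES
(90, '2017-01-01 00:01:00', 321, NULL, 'Chef entered kitchen'),
(91, '2017-01-01 00:02:00', 321, 51,   'Chef peeled vegetables'),
(92, '2017-01-01 00:03:00', 321, 52,   'Chef cut meat'),
(93, '2017-01-01 00:04:00', 321, NULL, 'Chef picked his nose'),
(94, '2017-01-01 00:05:00', 321, 53,   'Chef turned on stove'),
(95, '2017-01-01 00:06:00', 321, 54,   'Chef started cooking'),
(96, '2017-01-01 00:10:00', 321, NULL, 'Chef left kitchen'),
(97, '2017-01-01 01:00:00', 321, NULL, 'FIRE!');

WITH
CTE_History
AS
(
    SELECT
        -- Position of the next planned step at or after this row.
        -- 2147483647 is an assumed "larger than any position" fallback
        -- for unplanned rows that have no planned step after them.
        ISNULL(
            MIN(S.Position) OVER (ORDER BY H.ts
                ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
            , 2147483647) AS NewPosition
        ,H.ts
        ,S.Position
        ,H.Message AS NewMessage
        ,CASE WHEN S.Id IS NULL THEN 'unplanned' ELSE 'planned' END AS Comment
    FROM
        @History AS H
        LEFT JOIN @RecipeSteps AS S ON S.Id = H.StepId
)
,CTE_OmittedSteps
AS
(
    SELECT
        S.Position AS NewPosition
        ,H.ts
        ,S.Position
        ,S.Message AS NewMessage
        ,'omitted' AS Comment
    FROM
        @RecipeSteps AS S
        LEFT JOIN @History AS H ON S.Id = H.StepId
    WHERE
        H.Id IS NULL
)
SELECT * FROM CTE_History
UNION ALL
SELECT * FROM CTE_OmittedSteps
ORDER BY NewPosition, ts;
```

With this variant, omitted steps should sort before the neighbouring unplanned rows: `Wash vegetables` would come ahead of `Chef entered kitchen`, and `Turn off stove` ahead of `Chef left kitchen` and `FIRE!`. Like the `MAX` version, it still assumes that `Position` grows with `ts`.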
Query for SQL Server 2008

Here I replaced the `MAX` window function with a subquery that I put inside `OUTER APPLY`. It produces the same result as above.

```sql
WITH
CTE_HistoryRaw
AS
(
    -- History rows joined to their steps; Position is still NULL for unplanned rows.
    SELECT
        H.ts
        ,S.Position
        ,H.Message AS NewMessage
        ,CASE WHEN S.Id IS NULL THEN 'unplanned' ELSE 'planned' END AS Comment
    FROM
        @History AS H
        LEFT JOIN @RecipeSteps AS S ON S.Id = H.StepId
)
,CTE_History
AS
(
    SELECT
        ISNULL(A.NewPosition, 0) AS NewPosition
        ,Curr.ts
        ,Curr.Position
        ,Curr.NewMessage
        ,Curr.Comment
    FROM
        CTE_HistoryRaw AS Curr
        OUTER APPLY
        (
            -- Most recent non-NULL Position at or before the current row.
            SELECT TOP(1) Prev.Position AS NewPosition
            FROM CTE_HistoryRaw AS Prev
            WHERE
                Prev.ts <= Curr.ts
                AND Prev.Position IS NOT NULL
            ORDER BY Prev.ts DESC
        ) AS A
)
,CTE_OmittedSteps
AS
(
    -- Recipe steps that have no matching history row.
    SELECT
        S.Position AS NewPosition
        ,H.ts
        ,S.Position
        ,S.Message AS NewMessage
        ,'omitted' AS Comment
    FROM
        @RecipeSteps AS S
        LEFT JOIN @History AS H ON S.Id = H.StepId
    WHERE
        H.Id IS NULL
)
SELECT * FROM CTE_History
UNION ALL
SELECT * FROM CTE_OmittedSteps
ORDER BY NewPosition, ts;
```
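Coming back to the note at the top about `Session`: one possible way to handle several sessions at once, using the window-function form (so SQL Server 2012 or later), is to partition the running `MAX` by session and to compute the omitted steps per session. The sketch below rests on an assumption that is not in the original data model, namely that every session in `@History` follows the same recipe. The second session (322), the reduced three-step recipe, and the helper CTE `CTE_Sessions` are invented for illustration.

```sql
DECLARE @RecipeSteps TABLE (Id int, RecipeId int, Position int, Message nvarchar(255));
INSERT INTO @RecipeSteps (Id, RecipeId, Position, Message) VALUES
(50, 123, 1, 'Wash vegetables'),
(51, 123, 2, 'Peel vegetables'),
(52, 123, 3, 'Cut meat');

DECLARE @History TABLE (Id int, ts datetime2(0), Session int, StepId int, Message nvarchar(255));
INSERT INTO @History (Id, ts, Session, StepId, Message) VALUES
(90, '2017-01-01 00:01:00', 321, NULL, 'Chef entered kitchen'),
(91, '2017-01-01 00:02:00', 321, 51,   'Chef peeled vegetables'),
(92, '2017-01-01 00:03:00', 321, 52,   'Chef cut meat'),
-- Hypothetical second session, added only to show the partitioning.
(98, '2017-01-02 00:01:00', 322, 50,   'Chef washed vegetables'),
(99, '2017-01-02 00:02:00', 322, NULL, 'Chef left kitchen');

WITH
CTE_History
AS
(
    SELECT
        H.Session
        -- The running MAX restarts for every session.
        ,ISNULL(
            MAX(S.Position) OVER (PARTITION BY H.Session ORDER BY H.ts
                ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
            , 0) AS NewPosition
        ,H.ts
        ,S.Position
        ,H.Message AS NewMessage
        ,CASE WHEN S.Id IS NULL THEN 'unplanned' ELSE 'planned' END AS Comment
    FROM
        @History AS H
        LEFT JOIN @RecipeSteps AS S ON S.Id = H.StepId
)
,CTE_Sessions
AS
(
    SELECT DISTINCT Session FROM @History
)
,CTE_OmittedSteps
AS
(
    -- Every (session, step) pair with no matching history row.
    -- Assumes all sessions follow the recipe stored in @RecipeSteps.
    SELECT
        X.Session
        ,S.Position AS NewPosition
        ,CAST(NULL AS datetime2(0)) AS ts
        ,S.Position
        ,S.Message AS NewMessage
        ,'omitted' AS Comment
    FROM
        CTE_Sessions AS X
        CROSS JOIN @RecipeSteps AS S
    WHERE
        NOT EXISTS
        (
            SELECT 1
            FROM @History AS H
            WHERE H.Session = X.Session AND H.StepId = S.Id
        )
)
SELECT * FROM CTE_History
UNION ALL
SELECT * FROM CTE_OmittedSteps
ORDER BY Session, NewPosition, ts;
```

If sessions can follow different recipes, a mapping from session to recipe would be needed in place of the `CROSS JOIN`; nothing in the tables shown here provides one, so that part is left open.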