Sql-Server

SQL 多選與 where 條件的表現真的很差

  • January 9, 2018

資料結構是這樣的:

Number| ID | DateTime | Status

50000001, 101, 1/1/09 0:10, PO
50000001, 110, 1/1/09 0:11, PO
50000001, 102, 1/1/09 0:15, PO
50000001, 101, 1/1/09 0:10, PCK 
50000001, 102, 1/1/09 0:12, PCK 
50000001, 110, 1/1/09 0:12, PCK 
50000001, 101, 1/1/09 0:15, C 
50000001, 101, 1/1/09 0:15, C+
50000001, 110, 1/1/09 0:15, C 
50000001, 110, 1/1/09 0:15, C+
50000001, 102, 1/1/09 0:15, C 
50000001, 102, 1/1/09 0:15, C+
50000002, 126, 1/1/09 0:13, WO
50000002, 126, 1/1/09 0:14, PCK 
50000002, 126, 1/1/09 0:14, C 
50000002, 126, 1/1/09 0:14, S

我正在嘗試根據 Status 列中的幾個值選擇數據並將其返回格式如下:

| Number | OrderOpen | OrderCleared |

我的查詢如下所示:

Select 
SetOpen.ID, SetOpen.Num, SetOpen.OrderOpen, SetCleared.OrderCleared
FROM 
(
SELECT
   order_status_hist.ID AS 'ID',
   order_status_hist.Number AS 'Num',
   order_status_hist.datetime AS 'OrderOpen' 
FROM 
   dbo.order_status_hist
WHERE 
   order_status_hist.Status='WO' OR 
   order_status_hist.Status='PO'
) as SetOpen
inner JOIN
(
SELECT
   order_status_hist.ID AS 'ID',
   order_status_hist.Number AS 'Num',
   order_status_hist.datetime AS 'OrderCleared' 
FROM 
   order_status_hist
WHERE 
(
   order_status_hist.Status='C+' OR 
   order_status_hist.Status='S' 
)
) AS SetCleared

ON SetOpen.ID = SetCleared.ID

在我的生產數據庫中,狀態列中有大約 50 個值,而我的 where 語句在第一個中有 10 個“案例”,在第二個中有 5 個。當我執行它的生產版本時,需要很長時間才能完成。更麻煩的是,這實際上只是更大查詢的一部分,我將在“數字”上加入這些結果以將數據提取到更多表中。

鑑於此堆:

CREATE TABLE #foo(Number int, ID tinyint, [DateTime] datetime, [Status] varchar(10));

INSERT #foo(Number,ID,[DateTime],[Status]) VALUES
(50000001, 101, '1/1/2009 0:10', 'PO'),
(50000001, 110, '1/1/2009 0:11', 'PO'),
(50000001, 102, '1/1/2009 0:15', 'PO'),
(50000001, 101, '1/1/2009 0:10', 'PCK'), 
(50000001, 102, '1/1/2009 0:12', 'PCK'), 
(50000001, 110, '1/1/2009 0:12', 'PCK'), 
(50000001, 101, '1/1/2009 0:15', 'C'),
(50000001, 101, '1/1/2009 0:15', 'C+'),
(50000001, 110, '1/1/2009 0:15', 'C'),
(50000001, 110, '1/1/2009 0:15', 'C+'),
(50000001, 102, '1/1/2009 0:15', 'C'),
(50000001, 102, '1/1/2009 0:15', 'C+'),
(50000002, 126, '1/1/2009 0:13', 'WO'),
(50000002, 126, '1/1/2009 0:14', 'PCK'), 
(50000002, 126, '1/1/2009 0:14', 'C'),
(50000002, 126, '1/1/2009 0:14', 'S');

此查詢將其從兩次掃描降至一次掃描,擺脫雜湊匹配,並將大部分成本轉移到排序:

SELECT
 ID, Num = Number,
 OrderOpen    = MIN(CASE WHEN [Status] IN ('WO','PO') THEN [DateTime] END),
 OrderCleared = MAX(CASE WHEN [Status] IN ('S','C+')  THEN [DateTime] END)
FROM #foo
GROUP BY ID, Number;

這個 PIVOT 也更便宜,但仍然有昂貴的類型:

SELECT * FROM 
(
 SELECT ID, Number, [DateTime], BetterStatus = CASE 
   WHEN [Status] IN ('WO','PO') THEN 'OrderOpen'
   WHEN [Status] IN ('S', 'C+') THEN 'OrderCleared' END
 FROM #foo
) AS f
PIVOT
(
 MAX([Datetime]) FOR BetterStatus IN ([OrderOpen],[OrderCleared])
) AS p;
GO

但是,如果我們添加計算列和索引:

ALTER TABLE #foo ADD BetterStatus AS CONVERT(varchar(12), 
 ISNULL(CASE WHEN [Status] IN ('WO','PO') THEN 'OrderOpen'
      WHEN [Status] IN ('S','C+')  THEN 'OrderCleared' END, '?'));
GO
CREATE INDEX x ON #foo(Number, ID, [Datetime], BetterStatus);
GO

然後這兩個查詢的表現要好得多(都產生有序掃描,無需任何額外的排序):

SELECT
 ID, Num = Number,
 OrderOpen    = MIN(CASE [BetterStatus] WHEN 'OrderOpen' THEN [DateTime] END),
 OrderCleared = MAX(CASE [BetterStatus] WHEN 'OrderCleared' THEN [DateTime] END)
FROM #foo
GROUP BY ID, Number;

SELECT ID, Num, OrderOpen, OrderCleared
FROM 
(
 SELECT ID, Num = Number, [DateTime], BetterStatus FROM #foo
) AS f
PIVOT
(
 MAX([Datetime]) FOR BetterStatus IN ([OrderOpen],[OrderCleared])
) AS p;

因此,如果您沒有支持索引和/或無法添加計算列或新索引,請使用前兩個查詢之一,並且如果您可以添加這些項目(或修改現有索引以達到相同的效果),使用後兩者之一。

當然,這不是詳盡的調整。只是一個啟動。

引用自:https://dba.stackexchange.com/questions/164488