T-Sql
SQL Challenge - 感測器門檻值異常報告
我添加了一個不使用視窗函式的解決方案和一個帶有大量數據集的基準測試,低於 Martin’s Answer
這是GROUP BY 的後續執行緒,使用不在 SELECT 列表中的列 - 這什麼時候實用、優雅或強大?
在我對這個挑戰的解決方案中,我使用了一個查詢,該查詢按不屬於選擇列表的表達式進行分組。當邏輯分組元素涉及來自其他行的數據時,這經常與視窗函式一起使用。
舉例來說,也許這是一個矯枉過正的例子,但我認為您可能會發現挑戰本身就很有趣。我會等待發布我的解決方案,也許你們中的一些人可以提出更好的解決方案。
挑戰
我們有一個定期記錄讀數值的感測器表。不能保證採樣時間是單調的。
您需要編寫一個報告“異常”的查詢,這意味著感測器報告超出門檻值的讀數(低或高)的時間。感測器報告超過或低於門檻值的每個時間段都被視為“異常”。一旦讀數恢復正常,異常就結束了。
範例表和數據
該腳本採用 T-SQL 編寫,是我的培訓材料的一部分。
------------------------------------------ -- Sensor Thresholds - 1 - Setup Example -- ------------------------------------------ CREATE TABLE [Sensors] ( [Sensor] NVARCHAR(10) NOT NULL, [Lower Threshold] DECIMAL(7,2) NOT NULL, [Upper Threshold] DECIMAL(7,2) NOT NULL, CONSTRAINT [PK Sensors] PRIMARY KEY CLUSTERED ([Sensor]), CONSTRAINT [CK Value Range] CHECK ([Upper Threshold] > [Lower Threshold]) ); GO INSERT INTO [Sensors] ( [Sensor] , [Lower Threshold] , [Upper Threshold] ) VALUES (N'Sensor A', -50, 50 ), (N'Sensor B', 40, 80), (N'Sensor C', 0, 100); GO CREATE TABLE [Measurements] ( [Sensor] NVARCHAR(10) NOT NULL, [Measure Time] DATETIME2(0) NOT NULL, [Measurement] DECIMAL(7,2) NOT NULL, CONSTRAINT [PK Measurements] PRIMARY KEY CLUSTERED ([Sensor], [Measure Time]), CONSTRAINT [FK Measurements Sensors] FOREIGN KEY ([Sensor]) REFERENCES [Sensors]([Sensor]) ); GO INSERT INTO [Measurements] ( [Sensor] , [Measure Time] , [Measurement] ) VALUES ( N'Sensor A', N'20160101 08:00', -9), ( N'Sensor A', N'20160101 09:00', 30), ( N'Sensor A', N'20160101 10:30', 59), ( N'Sensor A', N'20160101 23:00', 66), ( N'Sensor A', N'20160102 08:00', 48), ( N'Sensor A', N'20160102 11:30', 08), ( N'Sensor B', N'20160101 08:00', 39), -- Note that this exception range has both over and under.... ( N'Sensor B', N'20160101 10:30', 88), ( N'Sensor B', N'20160101 13:00', 75), ( N'Sensor B', N'20160102 08:00', 95), ( N'Sensor B', N'20160102 17:00', 75), ( N'Sensor C', N'20160101 09:00', 01), ( N'Sensor C', N'20160101 10:00', -1), ( N'Sensor C', N'20160101 18:00', -2), ( N'Sensor C', N'20160101 22:00', -2), ( N'Sensor C', N'20160101 23:30', -1); GO
預期結果
Sensor Exception Start Time Exception End Time Exception Duration (minutes) Min Measurement Max Measurement Lower Threshold Upper Threshold Maximal Delta From Thresholds ------ -------------------- ------------------ ---------------------------- --------------- --------------- --------------- --------------- ----------------------------- Sensor A 2016-01-01 10:30:00 2016-01-02 08:00:00 1290 59.00 66.00 -50.00 50.00 16.00 Sensor B 2016-01-01 08:00:00 2016-01-01 13:00:00 300 39.00 88.00 40.00 80.00 8.00 Sensor B 2016-01-02 08:00:00 2016-01-02 17:00:00 540 95.00 95.00 40.00 80.00 15.00 Sensor C 2016-01-01 10:00:00 2016-01-01 23:30:00 810 -2.00 -1.00 0.00 100.00 -2.00 */
我可能會使用類似下面的東西。
它能夠使用索引順序並避免排序,直到它到達最終
GROUP BY
結果(對我來說,它使用流聚合)原則上,這個最終的分組操作實際上是不需要的。應該可以讀取按順序排序的輸入流
Sensor, MeasureTime
並以流方式輸出所需的結果,但我認為您需要為此編寫一個 SQLCLR 過程。WITH T1 AS (SELECT m.*, s.[Lower Threshold], s.[Upper Threshold], within_threshold, start_group_flag = IIF(within_threshold = 0 AND LAG(within_threshold, 1, 1) OVER (PARTITION BY m.[Sensor] ORDER BY [Measure Time]) = 1, 1, 0), next_measure_time = LEAD([Measure Time]) OVER (PARTITION BY m.[Sensor] ORDER BY [Measure Time]), overage = IIF(Measurement > [Upper Threshold], Measurement - [Upper Threshold], 0), underage =IIF(Measurement < [Lower Threshold], Measurement - [Lower Threshold], 0) FROM [Measurements] m JOIN [Sensors] s ON m.Sensor = s.Sensor CROSS APPLY (SELECT IIF(m.[Measurement] BETWEEN s.[Lower Threshold] AND s.[Upper Threshold],1,0)) ca(within_threshold)), T2 AS (SELECT *, group_number = SUM(start_group_flag) OVER (PARTITION BY [Sensor] ORDER BY [Measure Time] ROWS UNBOUNDED PRECEDING) FROM T1 WHERE within_threshold = 0) SELECT Sensor, [Exception Start Time] = MIN([Measure Time]), [Exception End Time] = MAX(ISNULL(next_measure_time, [Measure Time])), [Exception Duration (minutes)] = DATEDIFF(MINUTE, MIN([Measure Time]), MAX(ISNULL(next_measure_time, [Measure Time]))), [Min Measurement] = MIN(Measurement), [Max Measurement] = MAX(Measurement), [Lower Threshold], [Upper Threshold], [Maximal Delta From Thresholds] = IIF(MAX(overage) > -MIN(underage), MAX(overage), MIN(underage)) FROM T2 GROUP BY group_number, Sensor, [Lower Threshold], [Upper Threshold]