Mysql
如何在行間計算之前對記錄進行排序?
我有一個時序數據集要分析,但困難在於
- 數據集非常大。
- 時序事件由具有關聯的不同對象送出。
- 它在mysql中,沒有像索引列這樣的id來加入
例如
---------------- dt obj comp ---------------- t1, object_a, component_1 t2, object_b, component_1 t3, object_b, component_2 t4, object_a, component_1 t5, object_b, component_1 t6, object_b, component_2 ----------------
我想知道每個對象的每個組件的每次送出之間的延遲。即 obj_a.comp_1.delay1 = obj_a.compoent_1.t4 - obj_a.component_1.t1
等等。
我嘗試加入
select timediff( t1.dt, t.dt ) as delay from table as t join table as t1 on t1.comp = t.comp and t1.obj = t.obj and t1.dt = ( select min(t2.dt) from table as t2 where t2.obj = t.obj and t2.comp = t.comp and t2.dt > t.dt )
這需要永遠,所以我在想如果有什麼方法可以對結果進行排序並使用變數進行行間計算,它會快得多。但它在子查詢和視圖中都失敗了,在計算後總是進行排序。從我讀到的
order by
,這似乎是不可能的。那麼我必須有哪些選擇才能相當快地實現這一目標?更具體地說:如果我執行以下操作:
SELECT obj , comp , dt - @prev , @prev := dt FROM table ORDER BY obj, comp, dt ASC
結果是
obja, comp1, t1-null obja, comp1, t4-t3 # should be t4-t1 objb, comp1, t2-t1 objb, comp1, t5-t4 # should be t5-t2 objb, comp2, t3-t2 objb, comp2, t6-t5 # should be t6-t3
由於計算發生在未排序的數據上,因此結果沒有意義。我的表格實際上是一個視圖,組件是計算欄位。它沒有索引。join-subquery 方式可能需要半個小時。但是,如果我將排序結果保存到表格中並使用變數進行計算,則只需幾分鐘。數據集很大,我只測試了一小部分。我不確定保存到額外表的方法是可擴展的解決方案。所以我正在尋找替代方案的建議。
我已經成功地使用變數和排序來解決你的問題。這是我的測試設置:
CREATE TABLE atable (`dt` datetime, `obj` varchar(8), `comp` varchar(11)) ; INSERT INTO atable (`dt`, `obj`, `comp`) VALUES ('2016-09-13 06:00:01', 'object_a', 'component_1'), ('2016-09-13 06:00:02', 'object_b', 'component_1'), ('2016-09-13 06:00:04', 'object_b', 'component_2'), ('2016-09-13 06:00:08', 'object_a', 'component_1'), ('2016-09-13 06:00:16', 'object_b', 'component_1'), ('2016-09-13 06:00:32', 'object_b', 'component_2') ;
這是查詢:
SELECT dt, obj, comp, diff FROM (SELECT @obj := '', @comp := '', @dt := CAST('1970-01-01' AS datetime)) AS var, ( SELECT TIMESTAMPDIFF(SECOND, IF(obj = @obj AND comp = @comp, @dt, NULL), dt) AS diff, @dt := dt AS dt, @obj := obj AS obj, @comp := comp AS comp FROM atable ORDER BY obj, comp, dt ) AS sub ;
正如這個 SQL Fiddle 展示將顯示的那樣,查詢返回以下輸出:
dt obj comp diff ------------------- -------- ----------- ---- 2016-09-13 06:00:01 object_a component_1 NULL 2016-09-13 06:00:08 object_a component_1 7 2016-09-13 06:00:02 object_b component_1 NULL 2016-09-13 06:00:16 object_b component_1 14 2016-09-13 06:00:04 object_b component_2 NULL 2016-09-13 06:00:32 object_b component_2 28