Mysql

如何在行間計算之前對記錄進行排序?

  • April 1, 2017

我有一個時序數據集要分析,但困難在於

  1. 數據集非常大。
  2. 時序事件由具有關聯的不同對象送出。
  3. 它在mysql中,沒有像索引列這樣的id來加入

例如

----------------
dt     obj        comp
----------------
t1,    object_a,   component_1
t2,    object_b,   component_1
t3,    object_b,   component_2
t4,    object_a,   component_1
t5,    object_b,   component_1
t6,    object_b,   component_2
----------------

我想知道每個對象的每個組件的每次送出之間的延遲。即 obj_a.comp_1.delay1 = obj_a.compoent_1.t4 - obj_a.component_1.t1

等等。

我嘗試加入

select timediff( t1.dt, t.dt ) as delay
from  table as t 
join table as t1
on t1.comp = t.comp
and t1.obj = t.obj
and t1.dt = (
   select min(t2.dt)
   from table as t2
   where t2.obj = t.obj 
   and t2.comp = t.comp 
   and t2.dt > t.dt
)

這需要永遠,所以我在想如果有什麼方法可以對結果進行排序並使用變數進行行間計算,它會快得多。但它在子查詢和視圖中都失敗了,在計算後總是進行排序。從我讀到的order by,這似乎是不可能的。那麼我必須有哪些選擇才能相當快地實現這一目標?

更具體地說:如果我執行以下操作:

SELECT  
   obj
   , comp
   , dt - @prev
   , @prev := dt
FROM 
  table
ORDER BY obj, comp, dt ASC

結果是

obja, comp1, t1-null
obja, comp1, t4-t3      # should be t4-t1
objb, comp1, t2-t1      
objb, comp1, t5-t4      # should be t5-t2
objb, comp2, t3-t2
objb, comp2, t6-t5      # should be t6-t3

由於計算發生在未排序的數據上,因此結果沒有意義。我的表格實際上是一個視圖,組件是計算欄位。它沒有索引。join-subquery 方式可能需要半個小時。但是,如果我將排序結果保存到表格中並使用變數進行計算,則只需幾分鐘。數據集很大,我只測試了一小部分。我不確定保存到額外表的方法是可擴展的解決方案。所以我正在尋找替代方案的建議。

我已經成功地使用變數和排序來解決你的問題。這是我的測試設置:

CREATE TABLE atable
   (`dt` datetime, `obj` varchar(8), `comp` varchar(11))
;

INSERT INTO atable
   (`dt`, `obj`, `comp`)
VALUES
   ('2016-09-13 06:00:01', 'object_a', 'component_1'),
   ('2016-09-13 06:00:02', 'object_b', 'component_1'),
   ('2016-09-13 06:00:04', 'object_b', 'component_2'),
   ('2016-09-13 06:00:08', 'object_a', 'component_1'),
   ('2016-09-13 06:00:16', 'object_b', 'component_1'),
   ('2016-09-13 06:00:32', 'object_b', 'component_2')
;

這是查詢:

SELECT
 dt,
 obj,
 comp,
 diff
FROM
 (SELECT @obj := '', @comp := '', @dt := CAST('1970-01-01' AS datetime)) AS var,
 (
   SELECT
     TIMESTAMPDIFF(SECOND, IF(obj = @obj AND comp = @comp, @dt, NULL), dt) AS diff,
     @dt := dt AS dt,
     @obj := obj AS obj,
     @comp := comp AS comp
   FROM
     atable
   ORDER BY
     obj,
     comp,
     dt
 ) AS sub
;

正如這個 SQL Fiddle 展示將顯示的那樣,查詢返回以下輸出:

dt                   obj       comp         diff
-------------------  --------  -----------  ----
2016-09-13 06:00:01  object_a  component_1  NULL
2016-09-13 06:00:08  object_a  component_1  7
2016-09-13 06:00:02  object_b  component_1  NULL
2016-09-13 06:00:16  object_b  component_1  14
2016-09-13 06:00:04  object_b  component_2  NULL
2016-09-13 06:00:32  object_b  component_2  28

引用自:https://dba.stackexchange.com/questions/149296