PostgreSQL

Using start and end event logs, create a table/view that contains the span between each pair of log times

  • June 28, 2020

Specifically, I have an event table that records when a user joins or leaves a team. It looks like this:

-------------------------------------
| user | event  | team | timestamp |
-------------------------------------
| A    | joined | 1    | 2016-1-1  |
| B    | joined | 1    | 2016-1-1  |
| C    | left   | 1    | 2016-1-1  |
| C    | joined | 2    | 2016-1-1  |
| A    | left   | 1    | 2016-1-2  |
| A    | joined | 2    | 2016-1-2  |
| B    | left   | 1    | 2016-1-3  |
| A    | left   | 2    | 2016-1-3  |
-------------------------------------

I need to restructure it so that it looks like this:

--------------------------------------
| user | team | joined    | left     |
--------------------------------------
| A    | 1    | 2016-1-1  | 2016-1-2 |
| A    | 2    | 2016-1-2  | 2016-1-3 |
| B    | 1    | 2016-1-1  | 2016-1-3 |
| C    | 1    | null      | 2016-1-1 |
| C    | 2    | 2016-1-1  | null     |
--------------------------------------

How can I do that?

For additional detail: I am trying to do this in Amazon Redshift (PostgreSQL).

Assuming all columns are NOT NULL, and that a 'left' never comes earlier than the related 'joined'.
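To try the queries in this answer, the sample data from the question can be loaded roughly like this. This is only a sketch: the table name event matches the queries, but the column types (text, int, date) are my assumption, since the question does not state them.

```sql
-- Assumed schema: column types are guesses, all columns NOT NULL as stated above.
CREATE TABLE event (
   "user"      text NOT NULL   -- reserved word, so it must be double-quoted
 , event       text NOT NULL   -- 'joined' or 'left'
 , team        int  NOT NULL
 , "timestamp" date NOT NULL
);

-- Sample rows copied from the question.
INSERT INTO event ("user", event, team, "timestamp") VALUES
   ('A', 'joined', 1, '2016-01-01')
 , ('B', 'joined', 1, '2016-01-01')
 , ('C', 'left',   1, '2016-01-01')
 , ('C', 'joined', 2, '2016-01-01')
 , ('A', 'left',   1, '2016-01-02')
 , ('A', 'joined', 2, '2016-01-02')
 , ('B', 'left',   1, '2016-01-03')
 , ('A', 'left',   2, '2016-01-03');
```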

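Simple case

If a user can join a team only once (ideally, that would be enforced by a UNIQUE constraint on ("user", team)), the solution is a simple GROUP BY, which works in Redshift as well as in almost any RDBMS: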

SELECT "user", team
    , min(CASE WHEN event = 'joined' THEN timestamp END) AS joined
    , max(CASE WHEN event = 'left'   THEN timestamp END) AS "left"
FROM   event
GROUP  BY "user", team
ORDER  BY "user", joined NULLS FIRST;

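Note the NULLS FIRST clause. It seems you want rows with an open start (joined IS NULL) sorted first. Redshift supports that, too.

Other than that, it is the most basic form of a crosstab / pivot query.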

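Not so simple

Judging by your column names and sample data, it is probably not that simple. If users can join a team multiple times (without overlap), you have to do more. You do not want multiple memberships of the same team merged into a single row, as in this related answer:

Instead, you have to pair adjacent 'joined' and 'left' rows somehow. There are many ways…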
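To make the problem concrete, here is a small sketch that runs the simple GROUP BY query against hypothetical rows in which user A joins team 1 twice. The rows are invented for illustration and are not part of the question's data.

```sql
-- Hypothetical data: user A is a member of team 1 during two separate periods.
WITH event AS (
             SELECT 'A' AS "user", 'joined' AS event, 1 AS team, DATE '2016-01-01' AS "timestamp"
   UNION ALL SELECT 'A', 'left',   1, DATE '2016-01-02'
   UNION ALL SELECT 'A', 'joined', 1, DATE '2016-01-05'
   UNION ALL SELECT 'A', 'left',   1, DATE '2016-01-07'
)
SELECT "user", team
     , min(CASE WHEN event = 'joined' THEN "timestamp" END) AS joined
     , max(CASE WHEN event = 'left'   THEN "timestamp" END) AS "left"
FROM   event
GROUP  BY "user", team;
-- Returns a single row: A | 1 | 2016-01-01 | 2016-01-07
-- The gap between 2016-01-02 and 2016-01-05 is lost.
```

Both memberships collapse into one span, which is why the rows need to be paired before aggregating.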

Postgres 9.4+

In modern Postgres, I like this best:

SELECT "user", team
    , min(timestamp) FILTER (WHERE event = 'joined') AS joined
    , max(timestamp) FILTER (WHERE event = 'left'  ) AS "left"
FROM  (
   SELECT *, count(*) FILTER (WHERE event = 'joined')
                      OVER (PARTITION BY "user", team ORDER BY timestamp) AS ct
   FROM   event
   ) sub
GROUP  BY "user", team, ct
ORDER  BY "user", joined NULLS FIRST;

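This uses the aggregate FILTER clause both in the window function and in the aggregate functions. Related (with links to alternatives):

This way we count how many times the same user has joined the same team so far, which gives every membership its own number and lets us group adjacent rows. It also works when the leading 'joined' or the trailing 'left' is missing.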
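The running count is easier to follow when the inner query is run on its own. The sketch below does that for Postgres 9.4+ with hypothetical rows (user A joins team 1 twice, user C has a 'left' without a preceding 'joined'); none of these rows come from the question.

```sql
-- Hypothetical data, inlined so the query runs without any table.
WITH event AS (
             SELECT 'A' AS "user", 'joined' AS event, 1 AS team, DATE '2016-01-01' AS "timestamp"
   UNION ALL SELECT 'A', 'left',   1, DATE '2016-01-02'
   UNION ALL SELECT 'A', 'joined', 1, DATE '2016-01-05'
   UNION ALL SELECT 'A', 'left',   1, DATE '2016-01-07'
   UNION ALL SELECT 'C', 'left',   1, DATE '2016-01-01'
)
SELECT *
     , count(*) FILTER (WHERE event = 'joined')
                OVER (PARTITION BY "user", team ORDER BY "timestamp") AS ct
FROM   event
ORDER  BY "user", "timestamp";
-- ct per row, in output order:
--   A joined 2016-01-01 -> 1
--   A left   2016-01-02 -> 1
--   A joined 2016-01-05 -> 2
--   A left   2016-01-07 -> 2
--   C left   2016-01-01 -> 0   (no 'joined' seen yet)
```

Each 'joined' row bumps the counter, so a 'left' row inherits the number of the membership it closes, and a 'left' with no earlier 'joined' lands in group 0.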

Redshift

… does not support the newer FILTER clause. We can substitute a plain old CASE expression:

SELECT "user", team
    , min(CASE WHEN event = 'joined' THEN timestamp END) AS joined
    , max(CASE WHEN event = 'left'   THEN timestamp END) AS "left"
FROM  (
  SELECT *, count(CASE WHEN event = 'joined' THEN 1 END)
                     OVER (PARTITION BY "user", team ORDER BY timestamp, event) AS ct
  FROM   event
  ) sub
GROUP  BY "user", team, ct
ORDER  BY "user", joined NULLS FIRST;

SQL Fiddle.
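One caveat for the Redshift query above: as far as I know, Redshift requires an explicit frame clause whenever an aggregate window function has an ORDER BY, while Postgres falls back to a default frame. The following is a sketch of the same query with ROWS UNBOUNDED PRECEDING added, using hypothetical inline rows so that it runs without a table. The frame clause is my addition and not part of the original answer. The event tiebreaker in the ORDER BY sorts 'joined' before 'left' when both share a timestamp.

```sql
-- Hypothetical data; UNION ALL is used so the CTE is plain SELECT syntax.
WITH event AS (
             SELECT 'C' AS "user", 'left' AS event, 1 AS team, DATE '2016-01-01' AS "timestamp"
   UNION ALL SELECT 'A', 'joined', 1, DATE '2016-01-01'
   UNION ALL SELECT 'A', 'left',   1, DATE '2016-01-02'
   UNION ALL SELECT 'A', 'joined', 1, DATE '2016-01-05'
)
SELECT "user", team
     , min(CASE WHEN event = 'joined' THEN "timestamp" END) AS joined
     , max(CASE WHEN event = 'left'   THEN "timestamp" END) AS "left"
FROM  (
   SELECT *
        , count(CASE WHEN event = 'joined' THEN 1 END)
          OVER (PARTITION BY "user", team
                ORDER BY "timestamp", event
                ROWS UNBOUNDED PRECEDING) AS ct   -- explicit frame (assumed to be required on Redshift)
   FROM   event
   ) sub
GROUP  BY "user", team, ct
ORDER  BY "user", joined NULLS FIRST;
-- Result:
--   A | 1 | 2016-01-01 | 2016-01-02
--   A | 1 | 2016-01-05 | NULL
--   C | 1 | NULL       | 2016-01-01
```

With the (timestamp, event) ordering, the ROWS frame yields the same running count as the default frame in Postgres, as long as a partition holds no duplicate (timestamp, event) pairs.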


**Aside:** You shouldn't use reserved words as identifiers, even if Redshift (or Postgres) allows it.
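As a sketch of what that could look like, here is the same table with names that need no quoting. All names below (team_event, username, event_type, team_id, event_ts, joined_on, left_on) are hypothetical and chosen only for illustration, and the column types are assumptions.

```sql
-- Hypothetical names that avoid reserved words such as "user" and "left".
CREATE TABLE team_event (
   username   text NOT NULL   -- instead of "user"
 , event_type text NOT NULL   -- 'joined' or 'left'
 , team_id    int  NOT NULL
 , event_ts   date NOT NULL   -- instead of "timestamp"
);

-- The simple-case query then needs no double quotes at all.
SELECT username, team_id
     , min(CASE WHEN event_type = 'joined' THEN event_ts END) AS joined_on
     , max(CASE WHEN event_type = 'left'   THEN event_ts END) AS left_on
FROM   team_event
GROUP  BY username, team_id
ORDER  BY username, joined_on NULLS FIRST;
```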

You can use conditional aggregation to get the 'joined' / 'left' values:

SELECT "user", team,
       MAX(CASE WHEN event = 'joined' THEN timestamp END) AS joined,
       MAX(CASE WHEN event = 'left' THEN timestamp END) AS "left"
FROM mytable
GROUP BY "user", team
ORDER BY "user", team;

Demo here

Quoted from: https://dba.stackexchange.com/questions/125930