PostgreSQL

Group data by year based on two timestamp columns

  • April 7, 2016

My data table has the following columns:

id INTEGER, name TEXT, created TIMESTAMP, deleted TIMESTAMP

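I want to generate a report with a count of each name (a name can appear multiple times in the table) that was active in each year. (The deleted timestamp may be NULL if the row is currently still active.)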

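So far I have managed to do this by manually entering the years in a long string of UNION statements (see below). I'm sure there is a better way! I also have more similar queries to run. I tried to create a PL/pgSQL function, but couldn't figure out how to pass the year as a variable or how to get the correct output. I would be happy with either a single statement or a PL/pgSQL function that accomplishes this.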

(select '2016' yr, name, count(*) from data
where (((deleted - '2016-01-01'::timestamp) > '0 secs') or (deleted is null))
and (created - '2016-01-01'::timestamp) <= '0 secs'
group by name
order by count desc)
union all
(select '2015' yr, name, count(*) from data
where (((deleted - '2015-01-01'::timestamp) > '0 secs') or (deleted is null))
and (created - '2015-01-01'::timestamp) <= '0 secs'
group by name
order by count desc)
etc..

I got the years using:

select distinct date_part('year',created) from data order by date_part('year',created);

and then entered them manually in the very long UNION statement. (2007-2016 in my case!)

With generate_series() in a LATERAL join (Postgres 9.3+) and date_trunc(), this can be very short:

SELECT EXTRACT(YEAR FROM yr)::text AS year, name, count(*) AS ct
FROM   data
    , generate_series(date_trunc('year', created)  -- LATERAL join
                    , COALESCE(deleted, localtimestamp)
                    , interval '1 year') yr
GROUP  BY yr, name
ORDER  BY yr, ct DESC;

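That's all. It returns results for all years between the year of the earliest created and the current year. The trick is to generate one row for every year that a base row overlaps, before aggregating.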
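To make the row-multiplication step visible, here is a minimal sketch that runs the same generate_series() call without the aggregation. The two rows in the sample CTE are made up purely for illustration and are not from the original table.

```sql
-- Hypothetical sample rows, assumed only for this illustration.
WITH sample(id, name, created, deleted) AS (
   VALUES (1, 'foo', timestamp '2014-03-15', timestamp '2016-02-01')
        , (2, 'bar', timestamp '2015-07-01', NULL)   -- still active
)
SELECT s.id, s.name, EXTRACT(YEAR FROM yr)::int AS year
FROM   sample s
     , generate_series(date_trunc('year', s.created)          -- first day of the created year
                     , COALESCE(s.deleted, localtimestamp)    -- open-ended rows run until now
                     , interval '1 year') yr                  -- implicit LATERAL join
ORDER  BY s.id, yr;
```

Row 1 should expand to one row each for 2014, 2015 and 2016; row 2 to one row per year from 2015 up to the current year. Grouping by year and name then simply counts these expanded rows.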

Alternative:

SELECT EXTRACT(YEAR FROM yr) AS year, d.name, count(*) AS count
FROM   generate_series ((SELECT date_trunc('year', min(created)) FROM data)
                      , localtimestamp, interval '1 year') yr
JOIN   data d ON d.created < yr::timestamp + interval '1 year'
           AND (d.deleted > yr::timestamp OR d.deleted IS NULL)
GROUP  BY yr, d.name
ORDER  BY count(*) DESC;

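This produces the whole range of years before joining. It may be more convenient when you want to compute counts for hand-picked years.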
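For hand-picked years, one possible variation is to swap generate_series() for an explicit list of year starts. This is only a sketch: the three years in the VALUES list and the alias y(yr) are assumptions chosen for illustration.

```sql
SELECT EXTRACT(YEAR FROM y.yr)::int AS year, d.name, count(*) AS ct
FROM  (VALUES (timestamp '2010-01-01')      -- hypothetical selection of years,
            , (timestamp '2013-01-01')      -- each given as the first instant of the year
            , (timestamp '2016-01-01')) y(yr)
JOIN   data d ON d.created < y.yr + interval '1 year'        -- created before the year ended
             AND (d.deleted > y.yr OR d.deleted IS NULL)     -- not deleted before the year began
GROUP  BY y.yr, d.name
ORDER  BY y.yr, ct DESC;
```

The join condition is the same overlap test as in the query above, so a (name, year) pair with no active rows is simply absent from the result.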


If you need to optimize performance when selecting only a few years from a bigger table, a GiST index on the type tsrange would be an option.
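A minimal sketch of what such an index could look like, assuming an expression index on the range built from created and deleted. The index name data_active_range_idx and the chosen year are hypothetical, and whether the planner actually uses the index depends on table size and selectivity.

```sql
-- Expression index over the active period of each row.
-- tsrange(created, deleted) treats a NULL deleted as "no upper bound";
-- it raises an error for any row where deleted < created.
CREATE INDEX data_active_range_idx ON data USING gist (tsrange(created, deleted));

-- Rows active at any point in 2015. The default '[)' bounds include the
-- lower limit and exclude the upper limit. The WHERE expression must match
-- the indexed expression for the index to be considered.
SELECT name, count(*) AS ct
FROM   data
WHERE  tsrange(created, deleted) && tsrange('2015-01-01', '2016-01-01')
GROUP  BY name
ORDER  BY ct DESC;
```

The && operator tests whether the two ranges overlap, which matches the condition created < end of year AND (deleted > start of year OR deleted IS NULL) for rows where created is earlier than deleted.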

Source: https://dba.stackexchange.com/questions/134391