Postgresql

根據創建時間選擇返回不同數據的查詢

  • February 26, 2018

我有一個視圖,它將從 table 中選擇數據reports。選擇將根據多個日期過濾數據。

reports table:id, key_id, position, created_at

我的選擇:

pr.key_id,
COALESCE(CASE WHEN date(pr.created_at) = date(current_date - interval '1' day) THEN array_agg(DISTINCT pr.position) ELSE '{0}' END ) AS yesterday,
COALESCE(CASE WHEN date(pr.created_at) = date(current_date - interval '2' day) THEN array_agg(DISTINCT pr.position) ELSE '{0}' END ) AS last_week,
COALESCE(CASE WHEN date(pr.created_at) = date(current_date - interval '3' month) THEN array_agg(DISTINCT pr.position) ELSE '{0}' END ) AS last_month,

即使我按 key_id 和 date 分組,結果也不會分組。

我現在得到的結果範例:

  1. key_id: 1 昨天: {1} last_week: {0} last_month: {0} created_at: ‘2014-04-11’
  2. key_id: 1 昨天: {0} last_week: {1} last_month: {0} created_at: ‘2014-04-10’
  3. key_id: 1 昨天: {0} last_week: {0} last_month: {1} created_at: ‘2014-04-09’

我的期望:

  1. key_id: 1 昨天: {1} last_week: {1} last_month: {1} created_at: ‘2014-04-11’

通過這個查詢,我得到了我想要的。有更好的方法嗎?

SELECT
   k.name,
   array_agg(DISTINCT position),
   (SELECT array_agg(DISTINCT position) FROM project_reports prk WHERE prk.project_id = pr.project_id AND prk.keyword_id = pr.keyword_id AND date(prk.created_at) = date(current_date - interval '1' day)) as y_position,
   (SELECT array_agg(DISTINCT position) FROM project_reports prk WHERE prk.project_id = pr.project_id AND prk.keyword_id = pr.keyword_id AND date(prk.created_at) = date(current_date - interval '7' day)) as last_week,
   (SELECT array_agg(DISTINCT position) FROM project_reports prk WHERE prk.project_id = pr.project_id AND prk.keyword_id = pr.keyword_id AND date(prk.created_at) = date(current_date - interval '1' month)) as last_month,
   (SELECT array_agg(DISTINCT position) FROM project_reports prk WHERE prk.project_id = pr.project_id AND prk.keyword_id = pr.keyword_id AND date(prk.created_at) = date(current_date - interval '3' month)) as three_month,
   (SELECT array_agg(DISTINCT position) FROM project_reports prk WHERE prk.project_id = pr.project_id AND prk.keyword_id = pr.keyword_id AND date(prk.created_at) = date(current_date - interval '6' month)) as six_month,
   (SELECT array_agg(DISTINCT position) FROM project_reports prk WHERE prk.project_id = pr.project_id AND prk.keyword_id = pr.keyword_id AND date(prk.created_at) = date(current_date - interval '12' month)) as last_year,
   date(pr.created_at)
FROM project_keywords pk
INNER JOIN keywords k ON pk.keyword_id = k.id
INNER JOIN project_reports pr USING (project_id, keyword_id)
GROUP BY k.name, date(pr.created_at), pr.project_id, pr.keyword_id

如果我猜對了失去的部分,那麼這個使用crosstab()函式的查詢應該可以完成這項工作(並且比原來的要快得多——除了小基數,一開始就沒有多少收穫):

它要求您首先安裝附加模組tablefunc(每個數據庫一次。此相關答案中的詳細說明:

PostgreSQL Crosstab Query

SELECT name, project_id, positions, created_day
     ,yesterday, last_week, last_month, three_month, six_month, last_year
FROM  (
  SELECT k.name, pr.keyword_id, pr.project_id, pr.created_at::date AS created_day
       , array_agg(DISTINCT pr.position) AS positions
  FROM   project_reports  pr
  JOIN   keywords         k  ON k.id = pr.keyword_id
  GROUP  BY k.id, pr.project_id, created_day -- k.id being the pk
  ) sub
LEFT JOIN crosstab(
   $$SELECT ARRAY[keyword_id, project_id], created_at::date, array_agg(DISTINCT position)
     FROM   project_reports
     WHERE  created_at::date IN (
                   current_date - 1
                 , current_date - 7
                 , (now() - interval '1 mon')::date
                 , (now() - interval '3 mon')::date
                 , (now() - interval '6 mon')::date
                 , (now() - interval '12 mon')::date
                 )
     GROUP  BY 1,2
     ORDER  BY 1,2$$
  ,$$VALUES 
      (current_date - 1)
     ,(current_date - 7)
     ,((now() - interval '1 mon')::date)
     ,((now() - interval '3 mon')::date)
     ,((now() - interval '6 mon')::date)
     ,((now() - interval '12 mon')::date)$$
  ) AS t (
       kp int[]
     , yesterday   int[]
     , last_week   int[]
     , last_month  int[]
     , three_month int[]
     , six_month   int[]
     , last_year   int[]
     ) ON t.kp[1] = keyword_id
      AND t.kp[2] = project_id;

為了分解它,我在交叉表中所做的一個簡單展示:

SELECT * FROM crosstab(
$$SELECT *
  FROM  (
     VALUES
       ('{1,2}'::int[], current_date - 1, '{1,2}'::int[])
     , ('{1,3}'::int[], current_date - 1, '{1,2}'::int[])
     , ('{1,2}', current_date - 7, '{5,6,7}'::int[])
     , ('{1,3}', current_date - 7, '{5,6,7}'::int[])
     , ('{1,2}', (now() - interval '1 mon')::date, '{9,10}'::int[])
      -- 3 mon missing
     , ('{1,2}', (now() - interval '6 mon')::date, '{6,6,6}'::int[])
      -- 12 mon missing
     ) sub
   ORDER BY 1,2$$
  
,$$VALUES 
   (current_date - 1)
  ,(current_date - 7)
  ,((now() - interval '1 mon')::date)
  ,((now() - interval '3 mon')::date)
  ,((now() - interval '6 mon')::date)
  ,((now() - interval '12 mon')::date)$$
) AS t (
    kp int[]
  , yesterday   int[]
  , last_week   int[]
  , last_month  int[]
  , three_month int[]
  , six_month   int[]
  , last_year   int[]
  );

結果:

 kp   | yesterday | last_week | last_month | three_month | six_month | last_year
-------+-----------+-----------+------------+-------------+-----------+-----------
{1,2} | {1,2}     | {5,6,7}   | {9,10}     |             | {6,6,6}   |
{1,3} | {1,2}     | {5,6,7}   |            |             |           |

其餘的應該是顯而易見的。

註釋

  • 我將這兩個部分解耦,並將過多的相關子查詢融合到一個查詢中,這應該會快得多。然後我使用交叉製表從這個查詢中提取各個列。for 的語法crosstab()是棘手的部分。
  • 我沒有加入project_keywords,這似乎只是噪音。我懷疑您的關係模型存在問題。
  • 我似乎你應該project_id在結果中包含它才有意義。否則你不應該 GROUP BY 它……
  • 你可以integer從中減去date

引用自:https://dba.stackexchange.com/questions/63000