PostgreSQL

Constructing a SELECT query that includes aggregated values from different tables

  • August 19, 2020

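I have two tables in Postgres: polls and votes.

The first one, polls, is meant to hold the data about each poll. Every poll is meant to have only two possible responses: option_a and option_b.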

+----+---------+----------+----------+------------+
| id | author  | option_a | option_b | created_at |
+----+---------+----------+----------+------------+
| 1  | user_01 | apple    | banana   | 2020/08/15 |
+----+---------+----------+----------+------------+
| 2  | user_02 | tea      | coffee   | 2020/08/16 |
+----+---------+----------+----------+------------+

The second one, votes, holds the data about the votes cast:

+---------+---------+--------+------------+
| poll_id | voter   | option | voted_at   |
+---------+---------+--------+------------+
| 1       | user_01 | apple  | 2020/08/15 |
+---------+---------+--------+------------+
| 1       | user_02 | banana | 2020/08/15 |
+---------+---------+--------+------------+
| 1       | user_03 | banana | 2020/08/15 |
+---------+---------+--------+------------+
| 1       | user_04 | apple  | 2020/08/15 |
+---------+---------+--------+------------+
| 1       | user_05 | apple  | 2020/08/15 |
+---------+---------+--------+------------+
| 2       | user_01 | tea    | 2020/08/16 |
+---------+---------+--------+------------+
| 2       | user_08 | coffee | 2020/08/16 |
+---------+---------+--------+------------+
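To reproduce the sample data, something like the following setup could be used. This is only a sketch: the column types, the NOT NULL and key constraints, and the ISO date format are assumptions, since the question shows just the table contents.

```sql
-- Assumed schema: types and constraints are guesses, not taken from the question.
CREATE TABLE polls (
  id         int  PRIMARY KEY,
  author     text NOT NULL,
  option_a   text NOT NULL,
  option_b   text NOT NULL,
  created_at date NOT NULL
);

CREATE TABLE votes (
  poll_id  int  NOT NULL REFERENCES polls (id),
  voter    text NOT NULL,
  option   text NOT NULL,
  voted_at date NOT NULL,
  PRIMARY KEY (poll_id, voter)  -- assumption: one vote per voter and poll
);

-- Sample rows copied from the two tables shown above.
INSERT INTO polls (id, author, option_a, option_b, created_at) VALUES
  (1, 'user_01', 'apple', 'banana', '2020-08-15'),
  (2, 'user_02', 'tea',   'coffee', '2020-08-16');

INSERT INTO votes (poll_id, voter, option, voted_at) VALUES
  (1, 'user_01', 'apple',  '2020-08-15'),
  (1, 'user_02', 'banana', '2020-08-15'),
  (1, 'user_03', 'banana', '2020-08-15'),
  (1, 'user_04', 'apple',  '2020-08-15'),
  (1, 'user_05', 'apple',  '2020-08-15'),
  (2, 'user_01', 'tea',    '2020-08-16'),
  (2, 'user_08', 'coffee', '2020-08-16');
```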

What I want to do is select a poll together with its vote counts. For example, selecting the data for the poll with id = 1, I would expect to get:

+---------+----------+----------+------------+
| poll_id | option_a | option_b | created_at |
+---------+----------+----------+------------+
| 1       | 3        | 2        | 2020/08/15 |
+---------+----------+----------+------------+

How can I write such a query?

You can use filtered aggregation for this:

select v.poll_id, 
      count(*) filter (where v.option = p.option_a) as option_a, 
      count(*) filter (where v.option = p.option_b) as option_b,
      max(p.created_at) as created_at
from votes v 
 join polls p on p.id = v.poll_id
group by v.poll_id
order by v.poll_id;

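The max(p.created_at) is needed to keep the group by happy: created_at is not part of the group by list, so it has to be wrapped in an aggregate.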
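As a variation, here is a sketch that restricts the result to the single poll from the question and lists created_at in the group by instead of wrapping it in max(). The where clause and the extended group by are additions for illustration, not part of the query above.

```sql
select v.poll_id,
       count(*) filter (where v.option = p.option_a) as option_a,
       count(*) filter (where v.option = p.option_b) as option_b,
       p.created_at
from votes v
  join polls p on p.id = v.poll_id
where v.poll_id = 1               -- only the poll asked about in the question
group by v.poll_id, p.created_at; -- created_at is grouped, so no max() is needed
```

With the sample data from the question this should give 3 votes for option_a (apple) and 2 for option_b (banana).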

Online example

Query all polls at once

Use the aggregate FILTER clause, as a_horse already suggested:

SELECT p.id AS poll_id  -- ① PK column!
    , min(ct) FILTER (WHERE v.option = p.option_a) AS option_a 
    , min(ct) FILTER (WHERE v.option = p.option_b) AS option_b
    , p.created_at     -- ① no aggregate
FROM  polls p
LEFT  JOIN (  -- ②
  SELECT poll_id, option, count(*)::int AS ct
  FROM   votes
  GROUP  BY 1, 2
  ) v ON v.poll_id = p.id
GROUP  BY 1
ORDER  BY 1;

① Assuming polls.id is the PRIMARY KEY, grouping by it covers polls.created_at as well, so that column does not have to be aggregated.

② Aggregate first, join later. That is typically faster.

The LEFT JOIN keeps polls without any votes in the result.
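One consequence worth noting: for a poll without votes, min(ct) has nothing to aggregate, so both counts come out as NULL. If 0 is preferred, the aggregates can be wrapped in COALESCE. The following is a sketch of that variation, still assuming polls.id is the PRIMARY KEY:

```sql
SELECT p.id AS poll_id
     , COALESCE(min(v.ct) FILTER (WHERE v.option = p.option_a), 0) AS option_a
     , COALESCE(min(v.ct) FILTER (WHERE v.option = p.option_b), 0) AS option_b
     , p.created_at              -- covered by the PK in GROUP BY (assumption)
FROM   polls p
LEFT   JOIN (
   SELECT poll_id, option, count(*)::int AS ct
   FROM   votes
   GROUP  BY poll_id, option     -- one row per poll and option
   ) v ON v.poll_id = p.id
GROUP  BY p.id
ORDER  BY p.id;
```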

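crosstab() query for one poll

crosstab(), provided by the additional module tablefunc, is typically faster. But in this case, with only two options per poll and the need to access the table polls twice, it may be slower: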

SELECT v.*, p.created_at
FROM   crosstab(
  'SELECT poll_id, option, count(*)::int
   FROM   votes
   WHERE  poll_id = 1
   GROUP  BY 1, 2
   ORDER  BY 1, 2'

 ,'SELECT o.*
   FROM   polls, LATERAL (VALUES (option_a), (option_b)) o
   WHERE  id = 1'
  ) AS v (poll_id int, option_a int, option_b int)
JOIN polls p ON p.id = v.poll_id;
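Note that crosstab() is only available once the tablefunc module has been installed in the database. A minimal setup step, assuming the module is shipped with your installation and your role is allowed to create extensions:

```sql
-- Run once per database; does nothing if the extension is already installed.
CREATE EXTENSION IF NOT EXISTS tablefunc;
```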


db<>fiddle here

Quoted from: https://dba.stackexchange.com/questions/273861