PostgreSQL
Counting and grouping over multiple OUTER JOINs
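I am creating an advert log view (impressions, clicks, and clicks per impression). I have a simple table structure and some working queries, but I am having trouble combining them into a single query that can be used as a view (not a materialized view, since this will be live data).
The tables are: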
CREATE TABLE advert (
    id integer NOT NULL PRIMARY KEY
);
CREATE TABLE advert_event (
    code CHAR(1) NOT NULL PRIMARY KEY
);
CREATE TABLE advert_log (
    advertisement integer NOT NULL REFERENCES advert(id),
    event_code CHAR(1) NOT NULL REFERENCES advert_event(code)
);
And some sample data covering all the possible cases:
INSERT INTO advert VALUES (1);
INSERT INTO advert VALUES (2);
INSERT INTO advert VALUES (3);
INSERT INTO advert VALUES (4);
INSERT INTO advert_event VALUES ('I'); -- Impression
INSERT INTO advert_event VALUES ('C'); -- Click
INSERT INTO advert_log VALUES (1, 'I');
INSERT INTO advert_log VALUES (1, 'C');
INSERT INTO advert_log VALUES (2, 'I');
INSERT INTO advert_log VALUES (2, 'I');
INSERT INTO advert_log VALUES (2, 'C');
INSERT INTO advert_log VALUES (3, 'I');
INSERT INTO advert_log VALUES (3, 'I');
For reference, here is the set of things I want to count in advert_log:
Query A.
SELECT * FROM advert,advert_event;
Result A.
 id | code
----+------
  1 | I
  1 | C
  2 | I
  2 | C
  3 | I
  3 | C
  4 | I
  4 | C
(8 rows)
Counts of advert events:
Query B.
SELECT DISTINCT advertisement, event_code,
       COUNT(*) OVER (PARTITION BY advertisement, event_code)
FROM advert_log;
Result B.
 advertisement | event_code | count
---------------+------------+-------
             1 | I          |     1
             1 | C          |     1
             2 | I          |     2
             2 | C          |     1
             3 | I          |     2
(5 rows)
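For comparison, the same per-advert, per-event counts can be written with a plain GROUP BY instead of DISTINCT plus a window function. This is only a sketch against the tables defined above. The ORDER BY is an addition for readable output, and like Query B it still returns no row for a combination that never occurs in advert_log.

```sql
-- One row per (advertisement, event_code) pair that appears in the log.
-- Combinations with no log rows are absent, so zero counts never show up.
SELECT advertisement,
       event_code,
       COUNT(*) AS count
FROM advert_log
GROUP BY advertisement, event_code
ORDER BY advertisement, event_code;
```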
For any single advert, the correct counts can be obtained with the following queries:
Query C1.
SELECT COUNT(*) FROM advert_log WHERE advertisement=4 AND event_code='I';
 count
-------
     0
(1 row)
Query C2.
SELECT COUNT(*) FROM advert_log WHERE advertisement=4 AND event_code='C';
 count
-------
     0
(1 row)
Of course, Query B does not include zero counts, so it catches neither of the two cases above.
Ultimately, what I am trying to do is turn the numbers above into the following, dividing clicks ('C' entries) by impressions ('I' entries) to produce the cpi column:
 advertisement | impressions | clicks | cpi
---------------+-------------+--------+-----
             1 |           1 |      1 | 1.0
             2 |           2 |      1 | 0.5
             3 |           2 |      0 | 0.0
             4 |           0 |      0 | 0.0  <- or NULL, NaN, 1.0, ...
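The cpi column needs two small precautions: PostgreSQL truncates integer division, and an advert with no impressions would otherwise divide by zero. The sketch below uses literal values rather than the tables, so the column aliases are purely illustrative. It shows a float cast plus two common guards, depending on whether NULL or 0 is preferred for the undefined case.

```sql
SELECT 1 / 2                     AS int_division,   -- 0: both operands are integers
       1::float / 2              AS float_division, -- 0.5: cast one operand first
       0::float / NULLIF(0, 0)   AS cpi_as_null,    -- NULL: NULLIF turns a zero denominator into NULL
       0::float / GREATEST(0, 1) AS cpi_as_zero;    -- 0: denominator clamped to at least 1
```

Which guard to use is a reporting decision: NULL marks "no impressions yet", while 0 keeps the column free of NULLs.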
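My initial approach was to create a function for queries C1 and C2 and call that function from a view based on Query A.
I suspect there is a simpler way to achieve my goal with a single query.
While writing the question I was able to find a solution, but I decided to post both the question and the answer in case they help someone else in the future. Alternatively, if anyone has a simpler or better-performing solution to this problem, I would be happy to hear it.
I managed to solve my NULL count problem by using a CROSS JOIN before the OUTER JOIN: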
SELECT *
FROM advert
CROSS JOIN advert_event
LEFT OUTER JOIN advert_log
  ON advert_log.advertisement=advert.id
 AND advert_log.event_code=advert_event.code;
 id | code | advertisement | event_code
----+------+---------------+------------
  1 | I    |             1 | I
  1 | C    |             1 | C
  2 | I    |             2 | I
  2 | I    |             2 | I
  2 | C    |             2 | C
  3 | I    |             3 | I
  3 | I    |             3 | I
  3 | C    |               |
  4 | I    |               |
  4 | C    |               |
(10 rows)
The above gives me the intermediate table I need. Adding grouping and counting, I finally get the numbers I was looking for:
SELECT advert.id, advert_event.code, COUNT(advert_log.advertisement)
FROM advert
CROSS JOIN advert_event
LEFT OUTER JOIN advert_log
  ON advert_log.advertisement=advert.id
 AND advert_log.event_code=advert_event.code
GROUP BY advert.id, advert_event.code;
 id | code | count
----+------+-------
  1 | C    |     1
  1 | I    |     1
  2 | C    |     1
  2 | I    |     2
  3 | C    |     0
  3 | I    |     2
  4 | C    |     0
  4 | I    |     0
(8 rows)
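The choice of COUNT(advert_log.advertisement) over COUNT(*) matters here. The LEFT OUTER JOIN still emits one NULL-extended row for every advert/event pair with no log entries, and COUNT(*) would count that row as 1. The sketch below puts both aggregates side by side on the same join (the aliases count_star and count_column are illustrative only).

```sql
SELECT advert.id,
       advert_event.code,
       COUNT(*)                        AS count_star,   -- counts the NULL-extended row as well
       COUNT(advert_log.advertisement) AS count_column  -- ignores rows where the column is NULL
FROM advert
CROSS JOIN advert_event
LEFT OUTER JOIN advert_log
  ON advert_log.advertisement = advert.id
 AND advert_log.event_code = advert_event.code
GROUP BY advert.id, advert_event.code
ORDER BY advert.id, advert_event.code;
```

With the sample data, advert 4 has count_star = 1 and count_column = 0 for both event codes, which is why only the column form reports a true zero.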
Finally, using two subselects (one for 'I', one for 'C'), I wrote a query to get the counts per advert:
CREATE VIEW advertisement_dashboard AS
SELECT a.id, i.impressions, c.clicks,
       c.clicks::float/greatest(i.impressions, 1) AS cpi
FROM advert a,
(
    SELECT advert.id, COUNT(advert_log.advertisement) AS impressions
    FROM advert
    CROSS JOIN advert_event
    LEFT OUTER JOIN advert_log
      ON advert_log.advertisement=advert.id
     AND advert_log.event_code=advert_event.code
    GROUP BY advert.id, advert_event.code
    HAVING code='I'
) i,
(
    SELECT advert.id, COUNT(advert_log.advertisement) AS clicks
    FROM advert
    CROSS JOIN advert_event
    LEFT OUTER JOIN advert_log
      ON advert_log.advertisement=advert.id
     AND advert_log.event_code=advert_event.code
    GROUP BY advert.id, advert_event.code
    HAVING code='C'
) c
WHERE i.id=a.id AND c.id=a.id
ORDER BY a.id ASC;

SELECT * FROM advertisement_dashboard;
 id | impressions | clicks | cpi
----+-------------+--------+-----
  1 |           1 |      1 |   1
  2 |           2 |      1 | 0.5
  3 |           2 |      0 |   0
  4 |           0 |      0 |   0
(4 rows)
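As a possibly simpler alternative, the same per-advert figures can be sketched with a single LEFT JOIN and conditional aggregation. This is not the approach used above. It assumes PostgreSQL 9.4 or later for the aggregate FILTER clause, hard-codes the two event codes instead of reading them from advert_event, and has not been benchmarked against the view.

```sql
SELECT a.id AS advertisement,
       COUNT(*) FILTER (WHERE l.event_code = 'I') AS impressions,
       COUNT(*) FILTER (WHERE l.event_code = 'C') AS clicks,
       -- NULLIF makes cpi NULL (not an error) when there are no impressions
       COUNT(*) FILTER (WHERE l.event_code = 'C')::float
           / NULLIF(COUNT(*) FILTER (WHERE l.event_code = 'I'), 0) AS cpi
FROM advert a
LEFT JOIN advert_log l
  ON l.advertisement = a.id
GROUP BY a.id
ORDER BY a.id;
```

An advert with no log rows still appears, because the LEFT JOIN keeps it and the FILTER conditions evaluate to NULL for its single NULL-extended row, giving counts of 0. Unlike the view, this sketch reports cpi as NULL for such an advert. Swapping NULLIF(..., 0) for GREATEST(..., 1) would give 0 instead.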