Show the value associated with a maximum value
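I have a PostgreSQL database for a supermarket (this is a toy problem). I need to find which outlet sold the most copies of each product, and show that in a single query that also displays the product name, the description, the copies in stock across all outlets, and the copies purchased across all outlets.
I think I have a proper query for the first few columns, as follows: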
SELECT a.ProductName, a.ProductDescription, a.StockSum, b.PurchaseSum, c.MaxSales
FROM (
    SELECT Product.Name AS ProductName, Product.Description AS ProductDescription
         , SUM(Stock.copies) AS StockSum
    FROM Product
    INNER JOIN Stock ON Stock.product_id = Product.product_id
    GROUP BY Product.name, Product.description
) AS a
FULL JOIN (
    SELECT Product.name AS ProductName, Product.description AS ProductDescription
         , SUM(PurchaseItem.copies) AS PurchaseSum
    FROM Product
    INNER JOIN PurchaseItem ON PurchaseItem.product_id = Product.product_id
    GROUP BY Product.name, Product.description
) AS b;
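But I can't for the life of me figure out how to pull out the Outlet.name value associated with the MAX of all the SUMs of PurchaseItem.copies per outlet for a given product_id. It seems like a very complex query, and it has me stumped!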
The structure of the database is that Purchase references Outlet, and PurchaseItem references Purchase and a single Product (PurchaseItem.copies records how many copies of that product were sold in the purchase).
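For readers who want to reproduce the problem, here is a minimal sketch of a schema matching that description. The table and column names are taken from the queries in this thread; the column types, the keys, and the outlet_id column on Stock are assumptions, not something stated in the question.

```sql
-- Hypothetical DDL: types, keys, and Stock.outlet_id are assumptions.
CREATE TABLE Outlet (
    outlet_id integer PRIMARY KEY,
    name      text NOT NULL
);

CREATE TABLE Product (
    product_id  integer PRIMARY KEY,
    name        text NOT NULL,
    description text
);

-- Copies of a product currently in stock at an outlet.
CREATE TABLE Stock (
    outlet_id  integer REFERENCES Outlet,
    product_id integer REFERENCES Product,
    copies     integer NOT NULL,
    PRIMARY KEY (outlet_id, product_id)
);

-- A purchase happens at exactly one outlet.
CREATE TABLE Purchase (
    purchase_id integer PRIMARY KEY,
    outlet_id   integer NOT NULL REFERENCES Outlet
);

-- One row per product within a purchase; copies = number sold.
CREATE TABLE PurchaseItem (
    purchase_id integer REFERENCES Purchase,
    product_id  integer REFERENCES Product,
    copies      integer NOT NULL,
    PRIMARY KEY (purchase_id, product_id)
);
```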
You could use window functions here, but I think there is actually a better solution with DISTINCT ON. First, I simplified what you have so far:
SELECT p.name AS product_name, p.description AS product_description
     , a.stock_sum, b.purchase_sum
     , c.max_sales, o.name AS outlet_name  -- c and o are still missing
FROM   Product p
LEFT   JOIN (
   SELECT product_id, SUM(copies) AS stock_sum
   FROM   Stock
   GROUP  BY 1
   ) a USING (product_id)
LEFT   JOIN (
   SELECT product_id, sum(copies) AS purchase_sum
   FROM   PurchaseItem
   GROUP  BY 1
   ) b USING (product_id)
-- c, o still missing
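Aggregating the counts before joining should be considerably faster.
Also, LEFT JOIN keeps products in the result that have not been purchased yet or are no longer in stock. Then add the missing pieces: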
LEFT   JOIN (
   SELECT DISTINCT ON (product_id)
          pi.product_id, pu.outlet_id, sum(copies) AS max_sales
   FROM   Purchase pu
   JOIN   PurchaseItem pi USING (purchase_id)
   GROUP  BY 1, 2
   ORDER  BY 1, sum(copies) DESC NULLS LAST
   ) c USING (product_id)
LEFT   JOIN Outlet o USING (outlet_id);
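For comparison, the window-function route mentioned above might look roughly like the following sketch. It uses row_number() over the per-outlet sums instead of DISTINCT ON. The temporary tables and sample rows are hypothetical and only there so the statement can be run in a scratch session; the column types are assumptions.

```sql
-- Hypothetical sample data; column types are assumptions.
CREATE TEMP TABLE Purchase (purchase_id int PRIMARY KEY, outlet_id int);
CREATE TEMP TABLE PurchaseItem (purchase_id int, product_id int, copies int);

INSERT INTO Purchase VALUES (1, 10), (2, 10), (3, 20);
INSERT INTO PurchaseItem VALUES (1, 100, 2), (2, 100, 3), (3, 100, 4), (3, 200, 1);

-- Sum copies per (product, outlet), number the outlets per product
-- from best to worst seller, then keep only the best one.
SELECT product_id, outlet_id, sales AS max_sales
FROM  (
   SELECT pi.product_id, pu.outlet_id, sum(pi.copies) AS sales
        , row_number() OVER (PARTITION BY pi.product_id
                             ORDER BY sum(pi.copies) DESC NULLS LAST) AS rn
   FROM   Purchase pu
   JOIN   PurchaseItem pi USING (purchase_id)
   GROUP  BY pi.product_id, pu.outlet_id
   ) sub
WHERE  rn = 1
ORDER  BY product_id;
```

With the sample rows above, product 100 maps to outlet 10 (5 copies) and product 200 to outlet 20 (1 copy). Like DISTINCT ON, row_number() picks an arbitrary outlet when two outlets tie; rank() would return all tied outlets instead.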
About DISTINCT ON: you can apply DISTINCT to the result of an aggregation. Consider the sequence of events in a query: the GROUP BY aggregation is evaluated before DISTINCT ON picks the first row per product.

Optimize performance

It may be cheaper to scan PurchaseItem only once by using a CTE. But the CTE adds some cost of its own. You'll have to test which is faster:

WITH ct AS (
   SELECT pi.product_id, pu.outlet_id, sum(pi.copies) AS sales
   FROM   PurchaseItem pi
   JOIN   Purchase pu USING (purchase_id)
   GROUP  BY 1, 2
   )
SELECT p.name AS product_name, p.description AS product_description
     , a.stock_sum, b.purchase_sum
     , c.max_sales, o.name AS outlet_name
FROM   Product p
LEFT   JOIN (
   SELECT product_id, SUM(copies) AS stock_sum
   FROM   Stock
   GROUP  BY 1
   ) a USING (product_id)
LEFT   JOIN (
   SELECT product_id, sum(sales) AS purchase_sum
   FROM   ct
   GROUP  BY 1
   ) b USING (product_id)
LEFT   JOIN (
   SELECT DISTINCT ON (product_id)
          product_id, outlet_id, sales AS max_sales
   FROM   ct
   ORDER  BY product_id, sales DESC
   ) c USING (product_id)
LEFT   JOIN Outlet o USING (outlet_id);
Test performance with EXPLAIN ANALYZE (run it a few times to rule out caching effects).
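As a rough illustration of such a test, the sketch below builds throwaway data and times the DISTINCT ON subquery. The row counts and value distributions are made up, and the temporary tables are hypothetical stand-ins for the real ones; run it in a fresh scratch session and substitute your own tables for a meaningful comparison.

```sql
-- Hypothetical test data: sizes and distributions are invented for illustration.
CREATE TEMP TABLE Purchase AS
SELECT g AS purchase_id, g % 50 AS outlet_id
FROM   generate_series(1, 100000) g;

CREATE TEMP TABLE PurchaseItem AS
SELECT g AS purchase_id, g % 1000 AS product_id, 1 + g % 5 AS copies
FROM   generate_series(1, 100000) g;

-- Collect planner statistics for the new tables.
ANALYZE Purchase;
ANALYZE PurchaseItem;

-- Execute the statement several times and compare the later timings,
-- since the first run may include the cost of loading data into cache.
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT ON (pi.product_id)
       pi.product_id, pu.outlet_id, sum(pi.copies) AS max_sales
FROM   Purchase pu
JOIN   PurchaseItem pi USING (purchase_id)
GROUP  BY pi.product_id, pu.outlet_id
ORDER  BY pi.product_id, sum(pi.copies) DESC NULLS LAST;
```

The BUFFERS option is optional; it adds buffer hit and read counts to the plan, which helps show whether a timing difference between runs comes from caching.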
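If you added the schema for the tables involved along with some sample data, I think I could give you a better answer. That said, if I understand correctly what you are trying to accomplish, I would probably use a query more like the one below. The way I read your query, I don't really see the point of the MAX of the SUM when you need to restrict the results to a certain outlet and item. Anyway, see below. It is a bit simpler than your query and the aliases are different, but I believe it should be easy to follow.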
SELECT p.Name        AS "Product Name",
       p.Description AS "Product Description",
       SUM(s.copies) AS "Stock Sum",
       SUM(i.copies) AS "Purchase Sum",
       MAX(SUM(i.copies)) AS "Max Sum Purchase Items",  -- note: PostgreSQL rejects nested aggregate calls
       o.OutletName  AS "Outlet Name"
FROM Product AS p
INNER JOIN Stock s ON s.product_id = p.product_id
INNER JOIN PurchaseItem i ON i.product_id = p.product_id
INNER JOIN Outlet o ON o.outlet_id = p.outlet_id  -- assumes Product has an outlet_id column
WHERE p.product_id = <xxxxxxx>
  AND o.outlet_id = <xxxxxxx>
GROUP BY p.name, p.Description, o.OutletName
Updated query
SELECT p.Name        AS "Product Name",
       p.Description AS "Product Description",
       SUM(s.copies) AS "Stock Sum",
       SUM(i.copies) AS "Purchase Sum",
       -- the scalar subquery needs its own parentheses inside MAX()
       MAX((SELECT SUM(i2.copies)
            FROM PurchaseItem i2
            INNER JOIN Product p2 ON i2.product_id = p2.product_id
            INNER JOIN Outlet o2 ON o2.outlet_id = p2.outlet_id
            WHERE p2.product_id = p.product_id
              AND o2.outlet_id = o.outlet_id
            GROUP BY p2.product_id, o2.outlet_id)) AS "Max Sum Purchase Items",
       o.OutletName  AS "Outlet Name"
FROM Product AS p
INNER JOIN Stock s ON s.product_id = p.product_id
INNER JOIN PurchaseItem i ON i.product_id = p.product_id
INNER JOIN Outlet o ON o.outlet_id = p.outlet_id  -- assumes Product has an outlet_id column
WHERE p.product_id = <xxxxxxx>
  AND o.outlet_id = <xxxxxxx>
GROUP BY p.name, p.Description, o.OutletName
Maybe this is it, then.