Nested aggregation and grouping on multiple columns in MySQL
Consider the following `product` table (heavily trimmed):

    `id` int AUTO_INCREMENT
    `category_id` int
    `subcategory_id` int
    `vendor_id` int
    `price` decimal(6,2)
    `inserted_at` timestamp
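For a given category ID, I am trying to retrieve a list that contains, for each subcategory, the vendor with the lowest latest price. By "latest" I mean that a vendor may have several prices for a given category ID/subcategory ID combination, so only the most recently inserted price for that category ID/subcategory ID/vendor ID should be used. If two or more vendors tie on price, the lowest vendor_id should be used as the tie-breaker.
For example, with this data: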
    id | category_id | subcategory_id | vendor_id | price | inserted_at
    ---------------------------------------------------------------------------
     1 |           1 |              2 |         3 | 16.00 | 2015-07-23 04:00:00
     2 |           1 |              1 |         2 |  9.00 | 2015-07-26 08:00:00
     3 |           1 |              2 |         4 | 16.00 | 2015-08-02 10:00:00
     4 |           1 |              1 |         1 |  7.00 | 2015-08-04 11:00:00
     5 |           1 |              1 |         1 | 11.00 | 2015-08-09 16:00:00
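So, first find the latest price for each subcategory/vendor combination (the row with price=7.00 is dropped because it is not that vendor's latest price in that subcategory). Then, for subcategory 1, the lowest price is 9 (so vendor_id = 2), and for subcategory 2, the lowest price is 16 (two vendors, ids 3 and 4, are tied, so we pick the one with the lowest vendor_id = 3). I expect the following result for category_id = 1:

    subcategory_id | vendor_id | price
    ----------------------------------
                 1 |         2 |  9.00
                 2 |         3 | 16.00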
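The two-step rule described above can be sketched in plain Python as a reference for what any SQL version should return. This is only an illustration: the function name `cheapest_vendors` and the tuple layout are assumptions made for the example, not part of the original schema or query.

```python
from decimal import Decimal

# Sample rows from the question:
# (id, category_id, subcategory_id, vendor_id, price, inserted_at)
ROWS = [
    (1, 1, 2, 3, Decimal("16.00"), "2015-07-23 04:00:00"),
    (2, 1, 1, 2, Decimal("9.00"), "2015-07-26 08:00:00"),
    (3, 1, 2, 4, Decimal("16.00"), "2015-08-02 10:00:00"),
    (4, 1, 1, 1, Decimal("7.00"), "2015-08-04 11:00:00"),
    (5, 1, 1, 1, Decimal("11.00"), "2015-08-09 16:00:00"),
]


def cheapest_vendors(rows, category_id):
    """Hypothetical helper: return [(subcategory_id, vendor_id, price)]."""
    # Step 1: keep only the most recently inserted price
    # for each (subcategory, vendor) pair in the category.
    latest = {}
    for _id, cat, sub, vendor, price, inserted_at in rows:
        if cat != category_id:
            continue
        key = (sub, vendor)
        # Timestamps in 'YYYY-MM-DD HH:MM:SS' form sort correctly as strings.
        if key not in latest or inserted_at > latest[key][1]:
            latest[key] = (price, inserted_at)

    # Step 2: per subcategory, keep the lowest (price, vendor_id) pair,
    # so a price tie is broken by the lowest vendor_id.
    best = {}
    for (sub, vendor), (price, _ts) in latest.items():
        candidate = (price, vendor)
        if sub not in best or candidate < best[sub]:
            best[sub] = candidate

    return [(sub, vendor, price) for sub, (price, vendor) in sorted(best.items())]


result = cheapest_vendors(ROWS, 1)
for row in result:
    print(row)
```

Comparing `(price, vendor_id)` tuples expresses the tie-break directly, which is the same ordering an `ORDER BY price, vendor_id` would apply in SQL.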
Here is what I have so far. I feel it is already getting out of hand, and it does not even handle ties between the prices of two or more vendors yet.
    SELECT c.subcategory_id, c.vendor_id, c.price
    FROM products AS c
    JOIN (
        SELECT MIN(a.price) AS min_price, a.subcategory_id
        FROM products AS a
        JOIN (
            SELECT MAX(`inserted_at`) AS latest_price_time, vendor_id, subcategory_id
            FROM products
            WHERE category_id = 1
            GROUP BY vendor_id, subcategory_id
        ) AS b
          ON a.inserted_at = b.latest_price_time
         AND a.vendor_id = b.vendor_id
         AND a.subcategory_id = b.subcategory_id
        WHERE a.category_id = 1
        GROUP BY a.subcategory_id
    ) AS d
      ON c.price = d.min_price
     AND c.subcategory_id = d.subcategory_id
    WHERE c.category_id = 1
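Before I go any further, I would like to know whether there is a simpler way. When grouping/aggregating over the results of other groupings/aggregations, is there an approach that gives me the best performance (most important) and/or is easier to read (less important)?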
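This is a "greatest-n-per-group" query, which is quite complicated to write in MySQL: first because of the lack of window functions (in versions before 8.0), and second because you have two greatest-n-per-group criteria, the latest date per vendor within each subcategory and then the lowest price per subcategory.
Here is one rather convoluted way to write it: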
    SELECT ps.subcategory_id, ps.vendor_id, ps.price
           -- , ps.inserted_at
    FROM
        ( SELECT DISTINCT subcategory_id
          FROM product
          WHERE category_id = 1
        ) AS s
      JOIN product AS ps
        ON  ps.category_id = 1
        AND ps.subcategory_id = s.subcategory_id
        AND ps.id =
            ( SELECT psv.id
              FROM
                  ( SELECT DISTINCT subcategory_id, vendor_id
                    FROM product
                    WHERE category_id = 1
                  ) AS sv
                JOIN product AS psv
                  ON  psv.category_id = 1
                  AND psv.subcategory_id = sv.subcategory_id
                  AND psv.vendor_id = sv.vendor_id
                  AND psv.inserted_at =
                      ( SELECT pi.inserted_at
                        FROM product AS pi
                        WHERE pi.category_id = 1
                          AND pi.subcategory_id = sv.subcategory_id
                          AND pi.vendor_id = sv.vendor_id
                        ORDER BY pi.inserted_at DESC
                        LIMIT 1
                      )
              WHERE sv.subcategory_id = s.subcategory_id
              ORDER BY psv.price, psv.vendor_id
              LIMIT 1
            ) ;
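Tested in SQLfiddle-2. The plan is also not bad, given an appropriate index on (category_id, subcategory_id, vendor_id, inserted_at). It may not be the most efficient, and I would definitely experiment with indexes (see the fiddle, where I have one more index; it may not be very useful, but test it on a larger table).
(The first version of the query is in SQLfiddle-1.)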
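The window functions mentioned above as missing are available from MySQL 8.0 onward, together with common table expressions, so on those versions the two "per group" steps can each be written as a `ROW_NUMBER()` ranking. The sketch below is not the answer's query. To stay self-contained it runs the SQL through Python's built-in `sqlite3` module as a stand-in for MySQL (this assumes the bundled SQLite is 3.25 or newer), with `price` stored as `REAL` and `inserted_at` as text. The names `latest`, `ranked`, `recency_rank` and `price_rank` are illustrative, and the extra `id DESC` ordering is an assumed tie-break for identical timestamps that the question does not specify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    id             INTEGER PRIMARY KEY,
    category_id    INTEGER,
    subcategory_id INTEGER,
    vendor_id      INTEGER,
    price          REAL,   -- DECIMAL(6,2) in the MySQL table
    inserted_at    TEXT    -- TIMESTAMP in the MySQL table
);
INSERT INTO product VALUES
    (1, 1, 2, 3, 16.00, '2015-07-23 04:00:00'),
    (2, 1, 1, 2,  9.00, '2015-07-26 08:00:00'),
    (3, 1, 2, 4, 16.00, '2015-08-02 10:00:00'),
    (4, 1, 1, 1,  7.00, '2015-08-04 11:00:00'),
    (5, 1, 1, 1, 11.00, '2015-08-09 16:00:00');
""")

WINDOW_QUERY = """
WITH latest AS (
    -- Rank each vendor's prices within a subcategory, newest first.
    SELECT subcategory_id, vendor_id, price,
           ROW_NUMBER() OVER (
               PARTITION BY subcategory_id, vendor_id
               ORDER BY inserted_at DESC, id DESC  -- id DESC: assumed tie-break
           ) AS recency_rank
    FROM product
    WHERE category_id = ?
),
ranked AS (
    -- Among the latest prices only, rank vendors by price, then vendor_id.
    SELECT subcategory_id, vendor_id, price,
           ROW_NUMBER() OVER (
               PARTITION BY subcategory_id
               ORDER BY price, vendor_id
           ) AS price_rank
    FROM latest
    WHERE recency_rank = 1
)
SELECT subcategory_id, vendor_id, price
FROM ranked
WHERE price_rank = 1
ORDER BY subcategory_id
"""

window_rows = conn.execute(WINDOW_QUERY, (1,)).fetchall()
for row in window_rows:
    print(row)
```

Whether this performs better than the correlated-subquery version on a large table is something to measure rather than assume; the same index on (category_id, subcategory_id, vendor_id, inserted_at) is a reasonable starting point for testing.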