Innodb
如何使用 ORDER BY 優化涉及 GROUP_CONCAT 函式的 SQL 查詢
我編寫了以下 SQL 查詢來對結果進行分組,然後按日期對每個組中的結果進行排序:
SELECT origin, provider, SUBSTRING_INDEX(GROUP_CONCAT(price ORDER BY departure_date SEPARATOR ','), ',', 10) FROM `flights` WHERE origin = 'LHR' AND destination IN ('FCO', 'CDG', 'AMS') AND cabin_type = 'ECONOMY' GROUP BY origin , provider;
第二個查詢:
SELECT destination, provider, SUBSTRING_INDEX(GROUP_CONCAT(price ORDER BY departure_date SEPARATOR ','), ',', 10) FROM `flights` WHERE origin = 'LHR' AND destination IN ('FCO', 'CDG', 'AMS') AND cabin_type = 'ECONOMY' GROUP BY destination, provider;
InnoDB 表
flights
非常簡單:+----------------+---------------+------+-----+-----------+----------------+ | Field | Type | Null | Key | Default | Extra | +----------------+---------------+------+-----+-----------+----------------+ | id | int(11) | NO | PRI | NULL | auto_increment | | origin | varchar(255) | NO | MUL | NULL | | | destination | varchar(255) | NO | MUL | NULL | | | departure_date | date | NO | MUL | NULL | | | price | decimal(14,2) | NO | | NULL | | | provider | varchar(255) | NO | | airnet | | | cabin_type | varchar(255) | YES | | ECONOMY | | | created_at | datetime | YES | | NULL | | | updated_at | datetime | YES | | NULL | | +----------------+---------------+------+-----+-----------+----------------+
我在列上有索引:
origin, destination, departure_date
. 大約。表中有 1000 萬條記錄。上述查詢大約需要 200 到 600 毫秒。我嘗試使用分析器,發現大部分時間都花在了
Creating sort index
(大約 98% 的整個查詢時間)上。有沒有辦法優化這個查詢。我在 上嘗試了複合索引
origin, provider, departure_date
,但這似乎沒有幫助。也許應該對配置進行更多調整,如果是這樣,必須考慮哪些配置變數?機器有 4 GB 的 RAM、2 個 vCPU 並且專用於 DB(10.3.13-Maria DB)。
測試
WITH cte AS ( SELECT origin, provider, price, departure_date, ROW_NUMBER() OVER (PARTITION BY /* origin, */ provider ORDER BY departure_date) rn FROM flights WHERE origin = 'LHR' AND destination IN ('FCO', 'CDG', 'AMS') AND cabin_type = 'ECONOMY' ) SELECT origin, provider, GROUP_CONCAT(price ORDER BY departure_date SEPARATOR ',') FROM cte WHERE rn <= 10 GROUP BY origin, provider;
**更新:(**程式碼草案,見評論)
WITH cte1 AS ( SELECT origin, provider, price, departure_date, ROW_NUMBER() OVER (PARTITION BY provider ORDER BY departure_date) rn FROM flights WHERE origin = 'LHR' AND destination ='FCO' AND cabin_type = 'ECONOMY' ), cte2 AS ( SELECT origin, provider, price, departure_date from cte1 WHERE rn <= 10), cte3 AS ( SELECT origin, provider, price, departure_date, ROW_NUMBER() OVER (PARTITION BY provider ORDER BY departure_date) rn FROM flights WHERE origin = 'LHR' AND destination = 'CDG' AND cabin_type = 'ECONOMY' ), cte4 AS ( SELECT origin, provider, price, departure_date from cte3 WHERE rn <= 10), cte5 AS ( SELECT origin, provider, price, departure_date, ROW_NUMBER() OVER (PARTITION BY provider ORDER BY departure_date) rn FROM flights WHERE origin = 'LHR' AND destination = 'AMS' AND cabin_type = 'ECONOMY' ), cte6 AS ( SELECT origin, provider, price, departure_date from cte5 WHERE rn <= 10), cte7 AS ( SELECT origin, provider, price, departure_date FROM cte2 UNION ALL SELECT origin, provider, price, departure_date FROM cte4 UNION ALL SELECT origin, provider, price, departure_date FROM cte6 ) cte8 as ( SELECT origin, provider, price, departure_date, ROW_NUMBER() OVER (PARTITION BY provider ORDER BY departure_date) rn FROM cte7) SELECT origin, provider, GROUP_CONCAT(price ORDER BY departure_date SEPARATOR ',') FROM cte8 WHERE rn <= 10 GROUP BY origin, provider;
WHERE origin = 'LHR' AND destination IN ('FCO', 'CDG', 'AMS') AND cabin_type = 'ECONOMY'
乞求以下任一複合索引:
INDEX(origin, cabin_type, destination) INDEX(cabin_type, origin, destination)
origin``destination
看起來他們可以CHAR(3)
代替VARCHAR(255)
.
cabin_type
可能是一個ENUM
,或者至少是一個更短的VARCHAR
.