MySQL

Creating sort index takes 96% of the query time, even without any ORDER BY

  • August 14, 2020

I have this query:

SELECT itr.recipe_id,
  SUM(itr.weight),
  SUM(aprice_weight),
  SUM(itr.weight+aprice_weight) AS score
FROM
 (SELECT itfin.id AS flyer_item_id,
         itfin.flyer_id,
         MAX(itfin.max_weight) AS aprice_weight,
         itfin.ingredient_id
  FROM
    (SELECT MAX(price_weight) AS max_weight,
            flyer_items.id,
            flyer_items.flyer_id,
            ingredient_to_flyer_item.ingredient_id
     FROM flyer_items
     JOIN ingredient_to_flyer_item ON flyer_items.id = ingredient_to_flyer_item.flyer_item_id
     WHERE flyer_items.flyer_id IN (2)
     GROUP BY ingredient_to_flyer_item.flyer_item_id) AS itfin
  JOIN ingredient_to_flyer_item ON itfin.id = ingredient_to_flyer_item.flyer_item_id
  GROUP BY itfin.ingredient_id) AS itf
INNER JOIN `ingredient_to_recipe` AS `itr` ON `itf`.`ingredient_id` = `itr`.`ingredient_id`
GROUP BY `itr`.`recipe_id`
ORDER BY `score` DESC
LIMIT 12

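But even without the ORDER BY clause, the execution time stays the same, about 350 ms. I went through the indexes to see what I could do, but nothing has helped so far. Not even setting a larger SORT_BUFFER_SIZE in the MySQL settings made a difference.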

The profiler tells me that a single "Creating sort index" stage (out of 3) takes 96% of the query time.
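The post does not say which profiler produced that figure. Assuming it was MySQL's session-level statement profiling (deprecated in recent versions, but it is what reports stage names such as "Creating sort index"), a per-stage breakdown can be obtained roughly like this. The `SELECT 1` is only a placeholder for the query being measured, and the Query_ID of 1 is an assumption to be read off SHOW PROFILES.

```sql
-- Turn on statement profiling for the current session only.
SET profiling = 1;

-- Placeholder: run the statement to be measured here instead of SELECT 1.
SELECT 1;

-- List the profiled statements with their Query_ID and total duration.
SHOW PROFILES;

-- Stage-by-stage timing for one statement.
-- The ID 1 is an assumption; use the Query_ID shown by SHOW PROFILES.
SHOW PROFILE FOR QUERY 1;

-- Turn profiling off again when done.
SET profiling = 0;
```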

Here is the EXPLAIN output:

| id | select_type | table                    | partitions | type   | possible_keys           | key           | key_len | ref                      | rows | filtered | extra                                        |
|----|-------------|--------------------------|------------|--------|-------------------------|---------------|---------|--------------------------|------|----------|----------------------------------------------|
| 1  | PRIMARY     | <derived2>               | NULL       | ALL    | NULL                    | NULL          | NULL    | NULL                     | 583  | 100.00   | Using temporary; Using filesort              |
| 1  | PRIMARY     | itr                      | NULL       | ref    | recipe_id,ingredient_id | ingredient_id | 4       | itf.ingredient_id        | 35   | 100.00   | NULL                                         |
| 1  | PRIMARY     | r                        | NULL       | eq_ref | id                      | id            | 4       | metadata2.itr.recipe_id  | 1    | 10.00    | Using where                                  |
| 2  | DERIVED     | <derived3>               | NULL       | ALL    | NULL                    | NULL          | NULL    | NULL                     | 246  | 100.00   | Using temporary; Using filesort              |
| 2  | DERIVED     | ingredient_to_flyer_item | NULL       | ref    | flyer_item_id           | flyer_item_id | 4       | itfin.id                 | 2    | 100.00   | Using index                                  |
| 3  | DERIVED     | flyer_items              | NULL       | ALL    | id_2,id,flyer_id        | NULL          | NULL    | NULL                     | 104  | 100.00   | Using where; Using temporary; Using filesort |
| 3  | DERIVED     | ingredient_to_flyer_item | NULL       | ref    | flyer_item_id           | flyer_item_id | 4       | metadata2.flyer_items.id | 2    | 100.00   | NULL                                         |

Any help speeding up this query would be greatly appreciated. Thanks!

Update

Link to a .sql dump of the tables: https://nofile.io/f/F4YSEu8DWmT/meta.zip

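Every row in the EXPLAIN output that has NULL in the 'key' column should be investigated. The rows marked <derivedN> are not the culprits but the victims. In your case, the flyer_items table needs indexes: (id), (flyer_id), (id, flyer_id), and (flyer_id, id). I cannot predict which of the last two the optimizer will choose.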
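As a rough sketch, the four suggested indexes could be created as follows. The index names (`idx_...`) are made up for illustration. The EXPLAIN output in the question already lists keys named id_2, id, and flyer_id on flyer_items, so some of these probably exist; checking first avoids creating duplicates.

```sql
-- See which indexes flyer_items already has before adding anything.
SHOW INDEX FROM flyer_items;

-- Index names are hypothetical; drop any line whose column list
-- is already covered by an existing index.
ALTER TABLE flyer_items
    ADD INDEX idx_id (id),
    ADD INDEX idx_flyer_id (flyer_id),
    ADD INDEX idx_id_flyer_id (id, flyer_id),
    ADD INDEX idx_flyer_id_id (flyer_id, id);
```

Once EXPLAIN shows which of the two composite indexes the optimizer actually picks, the unused one can be dropped again.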

The derived tables produced by the JOINs will then get suitable inherited indexes for further processing.

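I believe you can simplify your query. There is a nested aggregate in it, and I am not sure what purpose it serves. Here is one simplification of your query: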

SELECT itr.recipe_id
    , SUM(itr.weight)
    , SUM(aprice_weight)
    , SUM(itr.weight+aprice_weight) AS score
FROM (
   SELECT itfin.id AS flyer_item_id
        , itfin.flyer_id
        , itfin.max_weight AS aprice_weight
        , itfin.ingredient_id
   FROM (
       SELECT MAX(price_weight) AS max_weight
            , flyer_items.id
            , flyer_items.flyer_id
            , ingredient_to_flyer_item.ingredient_id
       FROM flyer_items
       JOIN ingredient_to_flyer_item 
           ON flyer_items.id = ingredient_to_flyer_item.flyer_item_id
       WHERE flyer_items.flyer_id IN (2)
       GROUP BY ingredient_to_flyer_item.ingredient_id
              , flyer_items.flyer_id
              , flyer_items.id  -- full group by
   ) AS itfin
) AS itf
JOIN `ingredient_to_recipe` AS `itr` 
   ON `itf`.`ingredient_id` = `itr`.`ingredient_id`
GROUP BY `itr`.`recipe_id`
ORDER BY `score` DESC
LIMIT 12;

This can be simplified further:

SELECT itr.recipe_id
    , SUM(itr.weight)
    , SUM(max_weight)
    , SUM(itr.weight+max_weight) AS score
FROM (  
   SELECT MAX(price_weight) AS max_weight
        , flyer_items.id
        , flyer_items.flyer_id
        , ingredient_to_flyer_item.ingredient_id
   FROM flyer_items
   JOIN ingredient_to_flyer_item 
       ON flyer_items.id = ingredient_to_flyer_item.flyer_item_id
   WHERE flyer_items.flyer_id IN (2)
   GROUP BY ingredient_to_flyer_item.ingredient_id
          , flyer_items.flyer_id
          , flyer_items.id
) AS itf
JOIN ingredient_to_recipe AS itr
   ON itf.ingredient_id = itr.ingredient_id
GROUP BY itr.recipe_id
ORDER BY score DESC
LIMIT 12;

Given the JOIN predicates, you might consider the following index changes:

ALTER TABLE ingredient_to_flyer_item
   ADD KEY (flyer_item_id);

ALTER TABLE `flyer_items` DROP KEY `id`;

ALTER TABLE `flyer_items` DROP KEY `flyer_id`;

ALTER TABLE flyer_items ADD UNIQUE KEY (flyer_id, id);

I did not have time to check the indexes very thoroughly, though. Here is the EXPLAIN for the last query, with these index changes applied:

+------+-------------+--------------------------+------+---------------+---------------+---------+----------------------+------+----------------------------------------------+
| id   | select_type | table                    | type | possible_keys | key           | key_len | ref                  | rows | Extra                                        |
+------+-------------+--------------------------+------+---------------+---------------+---------+----------------------+------+----------------------------------------------+
|    1 | PRIMARY     | <derived2>               | ALL  | NULL          | NULL          | NULL    | NULL                 |   67 | Using temporary; Using filesort              |
|    1 | PRIMARY     | itr                      | ref  | ingredient_id | ingredient_id | 4       | itf.ingredient_id    |   30 |                                              |
|    2 | DERIVED     | flyer_items              | ref  | id_2,flyer_id | flyer_id      | 4       | const                |   67 | Using where; Using temporary; Using filesort |
|    2 | DERIVED     | ingredient_to_flyer_item | ref  | flyer_item_id | flyer_item_id | 4       | test3.flyer_items.id |    1 |                                              |
+------+-------------+--------------------------+------+---------------+---------------+---------+----------------------+------+----------------------------------------------+
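Compared with the plan in the question, flyer_items is now read through the flyer_id key (type ref) instead of a full scan (type ALL), and there is only one derived table left. To check just the derived part on your own data, the inner SELECT can be explained in isolation. This is a sketch that reuses the subquery from the last query unchanged:

```sql
-- Explain only the derived table, to see which index the optimizer
-- chooses for flyer_items after the index changes.
EXPLAIN
SELECT MAX(price_weight) AS max_weight
     , flyer_items.id
     , flyer_items.flyer_id
     , ingredient_to_flyer_item.ingredient_id
FROM flyer_items
JOIN ingredient_to_flyer_item
    ON flyer_items.id = ingredient_to_flyer_item.flyer_item_id
WHERE flyer_items.flyer_id IN (2)
GROUP BY ingredient_to_flyer_item.ingredient_id
       , flyer_items.flyer_id
       , flyer_items.id;
```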

Edit: a possible further simplification, removing flyer_items.id and flyer_items.flyer_id from the SELECT and the GROUP BY:

SELECT itr.recipe_id
    , SUM(itr.weight)
    , SUM(max_weight)
    , SUM(itr.weight+max_weight) AS score
FROM (  
   SELECT MAX(price_weight) AS max_weight
        , ingredient_to_flyer_item.ingredient_id
   FROM flyer_items
   JOIN ingredient_to_flyer_item 
       ON flyer_items.id = ingredient_to_flyer_item.flyer_item_id
   WHERE flyer_items.flyer_id IN (2)
   GROUP BY ingredient_to_flyer_item.ingredient_id
) AS itf
JOIN ingredient_to_recipe AS itr
   ON itf.ingredient_id = itr.ingredient_id
GROUP BY itr.recipe_id
ORDER BY score DESC
LIMIT 12;

Quoted from: https://dba.stackexchange.com/questions/208395