Mysql

對錶示弱實體類型的表的查詢進行優化

  • October 29, 2018

我一直在嘗試解決這個問題一段時間,所以我決定在這裡試試運氣。我有一個非常複雜的線上商店數據庫結構(一個數據庫中有多個商店,數十萬種產品),我正在查詢 MySQL 以返回指定類別的產品。產品與類別之間存在多對多 (M:N) 關係,因此我有表格product和(代表弱實體類型)。category``product_category

表模式如下:

CREATE TABLE `product` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `name` varchar(255) COLLATE utf8_czech_ci NOT NULL,
 `description` text COLLATE utf8_czech_ci NOT NULL,
 `date` timestamp NOT NULL DEFAULT current_timestamp(),
 `date_edited` timestamp NOT NULL DEFAULT current_timestamp(),
 `visibility` tinyint(1) NOT NULL DEFAULT 1,
 `main_slider` tinyint(1) NOT NULL DEFAULT 0,
 `bought` int(11) NOT NULL DEFAULT 0,
 `price_action` int(11) NOT NULL DEFAULT 0,
 `quantity_action` int(11) NOT NULL DEFAULT 0,
 `duration_action` timestamp NOT NULL DEFAULT current_timestamp(),
 `new` tinyint(1) NOT NULL DEFAULT 1,
 `type` int(2) NOT NULL DEFAULT 0,
 `origin_url` varchar(250) COLLATE utf8_czech_ci DEFAULT NULL,
 `origin_price` int(11) NOT NULL DEFAULT 0,
 `origin_price_dph` int(11) NOT NULL DEFAULT 0,
 `rating` int(11) NOT NULL DEFAULT 0,
 `ratingcount` int(11) NOT NULL DEFAULT 0,
 `facebook_flag` tinyint(1) DEFAULT 0,
 `salebot` tinyint(1) NOT NULL DEFAULT 0,
 `shop_supplier_id` int(11) NOT NULL,
 `shop_id` int(11) NOT NULL,
 `product_global_id` int(11) DEFAULT NULL,
 `prioritize` tinyint(1) NOT NULL DEFAULT 0,
 `free_delivery` tinyint(1) NOT NULL DEFAULT 0,
 PRIMARY KEY (`id`),
 UNIQUE KEY `product_global_id_2` (`product_global_id`,`shop_id`),
 KEY `shop_id` (`shop_id`),
 KEY `shop_supplier_id` (`shop_supplier_id`),
 KEY `product_global_id` (`product_global_id`),
 KEY `rating` (`rating`),
 KEY `date` (`date`),
 KEY `quantity_action` (`quantity_action`),
 KEY `duration_action` (`duration_action`),
 KEY `main_slider` (`main_slider`),
 KEY `shop_id_2` (`shop_id`,`main_slider`),
 KEY `shop_id_3` (`shop_id`,`prioritize`),
 KEY `shop_id_4` (`shop_id`,`date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_czech_ci;

CREATE TABLE `product_category` (
 `category_id` int(11) NOT NULL,
 `product_id` int(11) NOT NULL,
 PRIMARY KEY (`category_id`,`product_id`),
 KEY `category` (`category_id`),
 KEY `product` (`product_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_czech_ci;

我正在使用的查詢:

SELECT product.id, 
      product.name, 
      product.shop_supplier_id, 
      product.price_action, 
      product.quantity_action, 
      product.duration_action, 
      variant.id as variant_id, 
      IF(variant.price_default IS NULL, variant_global.price_vat, variant.price_default) as price_default, 
      variant.price_dph as price, 
      variant_global.delivery_date, 
      variant.value AS variant_name, 
      IF(product.type = 0, variant_global.voc_silver, variant.voc_dph) as voc_dph, 
      shop_supplier.text_available, 
      shop_supplier.text_unavailable, 
      IF(product.type = 0, variant_global.available = 1, variant.availability = 1) as avail, 
      variant.availability_type, 
      supplier.id as supplier_id, 
      variant_global.price_vat as price_default_ds, 
      product.type 
     FROM `product` 
     LEFT JOIN shop_supplier ON (`product`.`shop_supplier_id` = `shop_supplier`.`id`) 
     LEFT JOIN supplier ON (shop_supplier.supplier_id = `supplier`.`id`)      
     LEFT JOIN variant_global ON ((SELECT id FROM variant_global WHERE variant_global.product_global_id=product.product_global_id ORDER BY `available` DESC LIMIT 1) = `variant_global`.`id`) 
     LEFT JOIN product_global ON (variant_global.product_global_id = product_global.id) 
     LEFT JOIN image ON (`product`.`id` = `image`.`product_id`) 
     LEFT JOIN variant ON ((SELECT id FROM variant WHERE `product_id`=product.id ORDER BY `availability` DESC LIMIT 1) = `variant`.`id`) 
     LEFT JOIN product_category ON (`product`.`id` = `product_category`.`product_id`) 
       WHERE (product.shop_id=100000) 
         AND (product.visibility= 1) 
         AND (image.id IS NOT NULL) 
         AND (IF(product.type = 0, variant_global.available = 1, variant.availability = 1) = (CASE WHEN display_available = 1 AND display_unavailable = 1 THEN IF(product.type = 0, variant_global.available = 1, variant.availability = 1) WHEN display_available = 1 THEN 1 ELSE 0 END)) 
         AND (variant_global.deleted IS NULL OR variant_global.deleted = 0) 
         AND (product_category.category_id= 5) 
    GROUP BY product.id 
    ORDER BY prioritize DESC, RAND() LIMIT 5;

該查詢在單個頁面上針對不同類別多次呼叫,而且速度非常慢,我嘗試了所有不同的鍵和組合,但無法弄清楚。

查詢的 EXPLAIN 輸出如下:

+------+--------------------+------------------+--------+-----------------------------------------------+---------------------+---------+-------------------------------------+------+----------------------------------------------+
| id   | select_type        | table            | type   | possible_keys                                 | key                 | key_len | ref                                 | rows | Extra                                        |
+------+--------------------+------------------+--------+-----------------------------------------------+---------------------+---------+-------------------------------------+------+----------------------------------------------+
|    1 | PRIMARY            | product_category | ref    | PRIMARY,category,product                      | PRIMARY             | 4       | const                               | 4264 | Using index; Using temporary; Using filesort |
|    1 | PRIMARY            | product          | eq_ref | PRIMARY,shop_id,shop_id_2,shop_id_3,shop_id_4 | PRIMARY             | 4       | dropohs.product_category.product_id |    1 | Using where                                  |
|    1 | PRIMARY            | shop_supplier    | eq_ref | PRIMARY                                       | PRIMARY             | 4       | dropohs.product.shop_supplier_id    |    1 |                                              |
|    1 | PRIMARY            | supplier         | eq_ref | PRIMARY                                       | PRIMARY             | 4       | dropohs.shop_supplier.supplier_id   |    1 | Using where; Using index                     |
|    1 | PRIMARY            | variant          | eq_ref | PRIMARY                                       | PRIMARY             | 4       | func                                |    1 | Using where                                  |
|    1 | PRIMARY            | image            | ref    | PRIMARY,product_id_2,product_id               | product_id_2        | 4       | dropohs.product_category.product_id |    1 | Using where; Using index                     |
|    1 | PRIMARY            | variant_global   | eq_ref | PRIMARY                                       | PRIMARY             | 4       | func                                |    1 | Using where                                  |
|    3 | DEPENDENT SUBQUERY | variant          | ref    | product_id,product_id_2                       | product_id_2        | 4       | dropohs.product.id                  |    1 | Using where; Using index                     |
|    2 | DEPENDENT SUBQUERY | variant_global   | ref    | product_global_id,product_global_id_2         | product_global_id_2 | 4       | dropohs.product.product_global_id   |    1 | Using where; Using index                     |
+------+--------------------+------------------+--------+-----------------------------------------------+---------------------+---------+-------------------------------------+------+----------------------------------------------+

如果您需要查詢中使用的其他表的任何進一步資訊或模式,請告訴我,但我暫時忽略了它們,因為我認為它們不是問題的原因,因為它們都正確使用了索引。謝謝。

LEFT JOIN imageAND (image.id IS NOT NULL)_ 如果您不想要沒有圖像的產品,那麼INNER JOIN image請忽略AND image.id ...標準。與 相同LEFT JOIN product_category,如果您想要 中的非空類別WHERE,請不要使用LEFT JOIN,[INNER] JOIN是您真正要求的。

訂購RAND()可能也是緩慢的原因。

您的子查詢應該是您在以後的 MySQL 版本中variant* availability 嘗試使用視窗函式的東西。

嘗試將所有類別合併到查詢中並獲得一次而不是多次的結果。

也可以嘗試將此查詢結果記憶體到數據庫之外,這樣您就不需要經常這樣做。

我從未見過這種構造。感覺非常低效;嘗試想出不同的方式來實現意圖:

LEFT JOIN v ON ( SELECT id FROM v ... ) = v.id

您有許多冗餘索引。請注意以下規則:

  • APRIMARY KEY是一個INDEX
  • 如果你有INDEX(a,b),你也不需要INDEX(a)

看著

PRIMARY KEY (`id`),
UNIQUE KEY `product_global_id_2` (`product_global_id`,`shop_id`),
INDEX(shop_id, ...),
INDEX(shop_id, ...),
INDEX(shop_id, ...),
KEY `product_global_id` (`product_global_id`),

我建議

  • 擺脫id(除非它FOREIGN KEY來自其他表),
  • PRIMARY KEY(shop_id, product_global_id)- 以該順序
  • 扔掉它,UNIQUE KEY因為它現在是多餘的。
  • 將最後一個KEY(上面)更改(product_global_id, shop_id)為假設該對按該順序是需要的。
  • 考慮折騰所有以 PK 開頭的索引,shop_id因為它們大多是多餘的。(注意:數據與 PK 聚集在一起,我的新 PK 使查找行的shop_id效率更高。)
  • 更改GROUP BY以匹配我的 PK。

LEFT除非您期望“正確”表可能缺少行,否則不要使用。由於從鏈中的最後一張表EXPLAIN 開始,我懷疑**這些都不是必需的。LEFT JOIN``LEFTs

請注意,如果缺少某些“正確”行,NULLs則將顯示在以下表達式中。這可能會導致令人驚訝的結果。

    AND  (IF(product.type = 0,
                 variant_global.available = 1,
                        variant.availability = 1) =
            (CASE WHEN     display_available = 1
                      AND  display_unavailable = 1
                      THEN IF(product.type = 0, variant_global.available = 1,
                               variant.availability = 1)
                  WHEN display_available = 1 THEN 1 ELSE 0 END)
          )

在這個

         AND (variant_global.deleted IS NULL
           OR variant_global.deleted = 0) 

沒有索引可以使用。最好選擇其中一個NULL0表示“已刪除”並堅持下去。

唉,我真的看不到希望。

一個常見的性能技巧

SELECT ...
   FROM ..
   JOIN ..
   GROUP BY ..
   ORDER BY ..
   LIMIT ..

是做的LIMIT越早:

SELECT ...
   FROM ( SELECT id FROM ... ORDER BY .. LIMIT.. )
   JOIN ..
   -- GROUP BY ..  -- probably no longer needed
   ORDER BY ..     -- (yes, redundantly)
   -- LIMIT ..     -- not needed unless the `JOINs` inflate

那就是嘗試讓 5 個 id 盡可能少地在表周圍執行,然後JOIN(s)只有 5 行要查找。

唉,如果不做幾乎所有的工作,我沒有看到如何做那個“派生”表(子查詢)。

引用自:https://dba.stackexchange.com/questions/221004