Mysql

MySQL 多個子查詢 - 排序緩慢

  • November 6, 2021

知道如何優化此查詢嗎?

ORDER BY使用(~10 seconds) 子句時執行速度非常慢。如果我刪除 order_by 子句,它會執行幾毫秒。

問題是,我需要能夠使用子查詢的結果對結果進行排序。需要能夠通過以下方式訂購:num_products, credits_ordered, credits_consumed, num_refunds, amount_refunded,last_uploaded_datelast_order_date

查詢:

SELECT DISTINCT `users`.`id` as `user_id`, `users`.`email` as `user_email`, 

(select count(products.id) from products where products.user_id = users.id and products.added >= '2021-09-29' and products.added < '2021-10-29') as num_products, 

(select COALESCE(sum(amount), 0) from credits_history where credits_history.user_id = users.id and credits_history.type = 1 and credits_history.credits_type = 1 and credits_history.added >= '2021-09-29' and credits_history.added < '2021-10-29') as credits_ordered, 

(select COALESCE(sum(amount), 0) from credits_history where credits_history.user_id = users.id and credits_history.type = 2 and credits_history.credits_type = 1 and credits_history.added >= '2021-09-29' and credits_history.added < '2021-10-29') as credits_consumed, 

COALESCE(sum((select count(id) from credits_history where credits_history.user_id = users.id and credits_history.type = 3 and credits_history.credits_type = 1 and credits_history.added >= '2021-09-29' and credits_history.added < '2021-10-29')), 0)
+
COALESCE(sum((select count(id) from credits_history where credits_history.user_id = users.id and credits_history.type = 4 and credits_history.amount > 0 and credits_history.order_id IS NULL and credits_history.credits_type = 1 and credits_history.added >= '2021-09-29' and credits_history.added < '2021-10-29') ), 0) as num_refunds, 

COALESCE(sum((select sum(amount) from credits_history where credits_history.user_id = users.id and credits_history.type = 3 and credits_history.credits_type = 1 and credits_history.added >= '2021-09-29' and credits_history.added < '2021-10-29')), 0) 
+
COALESCE(sum((select sum(amount) from credits_history where credits_history.user_id = users.id and credits_history.type = 4 and credits_history.amount > 0 and credits_history.order_id IS NULL and credits_history.credits_type = 1 and credits_history.added >= '2021-09-29' and credits_history.added < '2021-10-29')), 0) as amount_refunded, 

(select added from products where products.user_id = users.id order by id desc limit 1) as last_uploaded_date, 
(select added from credits_history where credits_history.user_id = users.id and credits_history.type = 1 order by credits_history.id desc limit 1) as last_order_date
FROM `users`
WHERE `users`.`is_subuser` =0
GROUP BY `user_id`
ORDER BY `num_products` DESC
LIMIT 0,10

數據庫架構:

CREATE TABLE `users` (
 `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
 `email` varchar(100) NOT NULL,
 `is_subuser` tinyint(1) DEFAULT '0',
 `customer_type` tinyint(1) DEFAULT '1',
 `last_update` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
 PRIMARY KEY (`id`),
 KEY `is_subuser` (`is_subuser`)
) ENGINE=InnoDB AUTO_INCREMENT=25006 DEFAULT CHARSET=utf8;

CREATE TABLE `products` (
 `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
 `user_id` int(11) NOT NULL,
 `is_deleted` tinyint(1) DEFAULT '0',
 `added` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
 `updated` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
 PRIMARY KEY (`id`),
 KEY `is_user_id_is_deleted` (`id`,`user_id`,`is_deleted`),
 KEY `user_id_is_deleted` (`user_id`,`is_deleted`),
 KEY `id_is_deleted` (`id`,`is_deleted`),
 KEY `added` (`added`)
) ENGINE=InnoDB AUTO_INCREMENT=196921 DEFAULT CHARSET=utf8;


CREATE TABLE `credits_history` (
 `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
 `user_id` int(11) NOT NULL,
 `amount` int(11) NOT NULL,
 `type` tinyint(1) NOT NULL,
 `credits_type` tinyint(1) NOT NULL,
 `order_id` bigint(11) DEFAULT NULL,
 `added` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
 `updated` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
 PRIMARY KEY (`id`),
 KEY `user_id` (`user_id`),
 KEY `user_id_credits_type_added` (`user_id`,`credits_type`,`added`),
  KEY `order_id` (`order_id`)
) ENGINE=InnoDB AUTO_INCREMENT=176204 DEFAULT CHARSET=utf8;

解釋:

       +----+--------------------+-----------------+------------+-------+---------------------------------------------------------------------------------+----------------------------+---------+----------------+------+----------+--------------------------------------------------------+
   | id | select_type        | table           | partitions | type  | possible_keys                                                                   | key                        | key_len | ref            | rows | filtered | Extra                                                  |
   +----+--------------------+-----------------+------------+-------+---------------------------------------------------------------------------------+----------------------------+---------+----------------+------+----------+--------------------------------------------------------+
   |  1 | PRIMARY            | users           | NULL       | range | PRIMARY,can_login,is_subuser_master_id,is_subuser,magento_customer_id,master_id | is_subuser                 | 2       | NULL           | 6720 |   100.00 | Using index condition; Using temporary; Using filesort |
   | 10 | DEPENDENT SUBQUERY | credits_history | NULL       | ref   | user_id,user_id_credits_type_added,user_id_slave_id                             | user_id_credits_type_added | 4       | func           |   85 |    10.00 | Using index condition; Using where; Using filesort     |
   |  9 | DEPENDENT SUBQUERY | products        | NULL       | ref   | user_id_is_deleted                                                              | user_id_is_deleted         | 4       | func           |   79 |   100.00 | Using index condition; Using filesort                  |
   |  8 | DEPENDENT SUBQUERY | credits_history | NULL       | ref   | user_id,user_id_credits_type_added,user_id_slave_id,order_id                    | user_id_slave_id           | 4       | tf_db.users.id |   90 |     0.06 | Using index condition; Using where                     |
   |  7 | DEPENDENT SUBQUERY | credits_history | NULL       | ref   | user_id,user_id_credits_type_added,user_id_slave_id                             | user_id_slave_id           | 4       | tf_db.users.id |   90 |     0.11 | Using index condition; Using where                     |
   |  6 | DEPENDENT SUBQUERY | credits_history | NULL       | ref   | user_id,user_id_credits_type_added,user_id_slave_id,order_id                    | user_id_slave_id           | 4       | tf_db.users.id |   90 |     0.06 | Using index condition; Using where                     |
   |  5 | DEPENDENT SUBQUERY | credits_history | NULL       | ref   | user_id,user_id_credits_type_added,user_id_slave_id                             | user_id_slave_id           | 4       | tf_db.users.id |   90 |     0.11 | Using index condition; Using where                     |
   |  4 | DEPENDENT SUBQUERY | credits_history | NULL       | ref   | user_id,user_id_credits_type_added,user_id_slave_id                             | user_id_slave_id           | 4       | func           |   90 |     0.11 | Using index condition; Using where                     |
   |  3 | DEPENDENT SUBQUERY | credits_history | NULL       | ref   | user_id,user_id_credits_type_added,user_id_slave_id                             | user_id_slave_id           | 4       | func           |   90 |     0.11 | Using index condition; Using where                     |
   |  2 | DEPENDENT SUBQUERY | products        | NULL       | ref   | user_id_is_deleted,added                                                        | user_id_is_deleted         | 4       | func           |   79 |     0.93 | Using index condition; Using where                     |
   +----+--------------------+-----------------+------------+-------+---------------------------------------------------------------------------------+----------------------------+---------+----------------+------+----------+--------------------------------------------------------+

為了簡單起見,我從表模式中刪除了一些欄位/索引,因為表有很多欄位。

我懷疑是否有任何理由在同一個查詢中同時DISTINCT使用兩者。GROUP BY

讓我們把查詢從裡到外。但是,這會增加複雜性——即您需要對每個不同的ORDER BYs.

目前,這個查詢就是我所說的“explode-implode”。它在很多表中命中很多行,只是後來將結果歸結為 10 行。我們需要盡量減少為 10 行查找 id 的工作量,然後完成所有其他工作。

要點是

SELECT ids.user_id,
      ids.num_products,
      ( ... ) AS credits_ordered,
      ( ... ) AS ...,
      ( ... ) AS ...,
      COALESCE( ... ) AS ...,
      COALESCE( ... ) AS ...
   FROM (
       SELECT users.id AS user_id
           FROM `users`, num_products
           JOIN .... (( to get num_products ))
           GROUP BY users.id
           ORDER BY `num_products` DESC
           LIMIT 0,10
        ) AS ids
   ORDER BY ids.num_products DESC

id 是可能需要觸摸很多行來獲取ids子查詢,但只需要觸摸 10 行來獲取其他每個子查詢。

另一種方法(可能效果更好或效果更差):

其餘的似乎都取決於

           from  credits_history
           where  credits_history.user_id = users.id
             and  credits_history.credits_type = 1
             and  credits_history.added >= '2021-09-29'
             and  credits_history.added <  '2021-10-29'

(加上對type和的測試amount

所以,嘗試使用

  JOIN ( SELECT user_id, type, amount, credits_history
           FROM credits_history
           WHERE  credits_history.credits_type = 1
             and  credits_history.added >= '2021-09-29'
             and  credits_history.added <  '2021-10-29'
       ) AS ch
     ON  ch.user_id = users.id

type然後粘amount在子查詢中。

第三個想法是做類似的事情(假設您使用的是 8.0 或 10.2)

WITH ch AS
   ( SELECT user_id, type, amount, credits_history
           FROM credits_history
           WHERE  credits_history.credits_type = 1
             and  credits_history.added >= '2021-09-29'
             and  credits_history.added <  '2021-10-29'
   )

同時,其中一些可能會有所幫助:

users:  INDEX(is_subuser,  id, email)
products:  INDEX(user_id, added,  id)
credits_history:  INDEX(user_id, credits_type, type, added)
credits_history:  INDEX(user_id, type, amount, order_id, credits_type, added)

清理一些索引:

 KEY `user_id` (`user_id`),  -- Drop because redundant with the following:
 KEY `user_id_credits_type_added` (`user_id`, `credits_type`, `added`),

 KEY ... (id, ...)  -- Drop because redundant with PRIMARY KEY(id)

一般情況下,使用COUNT(*)代替COUNT(id).

您的解釋計劃看起來像是從 credits_history 表中讀取了 7 次。一般來說,如果可以的話,最好只從表格中讀取一次。多次讀取肯定會為查詢增加大量執行時間。

我會嘗試重寫它,以便您只從 credits_history 中讀取一次。使 where 子句足夠通用,以便一次讀取所有組中需要的所有行。然後在這些行的事實之後進行條件求和。

像這樣的東西。(我使用的是 Postgresql,所以語法可能有點偏離)

select
 c.result_one,
 c.result_one
 ...
from 
`users` u
left join lateral (
 select
   sum(case when c.history_type = 1 then c.amount else 0 end) as result_one,
   sum(case when c.history_type = 2 and c.some_other_thing = 'foo' then c.amount else 0) as result_two,
 from `credits_history` c
 where 
   c.user_id = u.id
   ...--set the rest of the where clause
      --such that you read all credits_history rows that will be included anywhere in your dataset
) c on true
--and your other joins as they make sense results. make sure it could use indexes if appropriate.
;

引用自:https://dba.stackexchange.com/questions/301876