Mysql

MySQL按慢速排序

  • December 23, 2019

我有一個名為“客戶”的表,具有以下結構和索引:

DESCRIBE customer;
+--------------------+---------------+------+-----+-------------------+-----------------------------+
| Field              | Type          | Null | Key | Default           | Extra                       |
+--------------------+---------------+------+-----+-------------------+-----------------------------+
| id                 | int(11)       | NO   | PRI | NULL              | auto_increment              |
| email              | varchar(70)   | NO   | UNI | NULL              |                             |
| orders             | int(11)       | YES  |     | NULL              |                             |
| country            | text          | YES  |     | NULL              |                             |
| last_date_order    | datetime      | YES  |     | NULL              |                             |
| first_date_order   | datetime      | YES  |     | NULL              |                             |
| name               | varchar(100)  | YES  |     | NULL              |                             |
| lastname           | varchar(100)  | YES  |     | NULL              |                             |
| male               | tinyint(1)    | YES  |     | NULL              |                             |
| gender_probability | decimal(10,2) | YES  |     | NULL              |                             |
| phone              | varchar(20)   | YES  |     | NULL              |                             |
| phone_formatted    | varchar(20)   | YES  |     | NULL              |                             |
| created_at         | timestamp     | YES  |     | NULL              |                             |
| updated_at         | timestamp     | NO   | MUL | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+--------------------+---------------+------+-----+-------------------+-----------------------------+

SHOW INDEX FROM customer;
+----------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table    | Non_unique | Key_name   | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| customer |          0 | PRIMARY    |            1 | id          | A         |     1899746 |     NULL | NULL   |      | BTREE      |         |               |
| customer |          0 | email      |            1 | email       | A         |     1899746 |     NULL | NULL   |      | BTREE      |         |               |
| customer |          1 | updated_at |            1 | updated_at  | A         |         200 |     NULL | NULL   |      | BTREE      |         |               |
+----------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

我可以執行這個查詢,大約需要 15 秒:

SELECT * FROM customer WHERE updated_at >= '2018-02-16 00:00:00';
931817 rows in set (14.33 sec)

這是相同的查詢,但添加了一個order by

SELECT * FROM customer WHERE updated_at >= '2018-02-16 00:00:00' ORDER BY updated_at ASC;
931817 rows in set (5 min 21.17 sec)

我認為 5 分鐘太慢了,查詢的解釋顯示:

EXPLAIN SELECT * FROM customer WHERE updated_at >= '2018-02-16 00:00:00' ORDER BY updated_at ASC;
+----+-------------+----------+------+---------------+------+---------+------+---------+-----------------------------+
| id | select_type | table    | type | possible_keys | key  | key_len | ref  | rows    | Extra                       |
+----+-------------+----------+------+---------------+------+---------+------+---------+-----------------------------+
|  1 | SIMPLE      | customer | ALL  | updated_at    | NULL | NULL    | NULL | 1899744 | Using where; Using filesort |
+----+-------------+----------+------+---------------+------+---------+------+---------+-----------------------------+

簡介是這樣說的:

SET profiling = 1;
SELECT * FROM customer WHERE updated_at >= '2018-02-16 00:00:00' ORDER BY updated_at ASC;
SHOW PROFILE;
SET profiling = 0;

+------------------------------+------------+
| Status                       | Duration   |
+------------------------------+------------+
| Sending data                 |   0.036616 |
| Waiting for query cache lock |   0.000034 |
| Sending data                 |   0.088051 |
| Waiting for query cache lock |   0.000034 |
| Sending data                 |   0.064104 |
| Waiting for query cache lock |   0.000038 |
| Sending data                 |   0.045420 |
| Waiting for query cache lock |   0.000043 |
| Sending data                 |   0.035219 |
| Waiting for query cache lock |   0.000025 |
| Sending data                 |   0.049755 |
| Waiting for query cache lock |   0.000035 |
| Sending data                 |   0.069008 |
| Waiting for query cache lock |   0.000036 |
| Sending data                 |   0.068619 |
| Waiting for query cache lock |   0.000042 |
| Sending data                 |   0.067039 |
| Waiting for query cache lock |   0.000048 |
| Sending data                 |   0.051564 |
| Waiting for query cache lock |   0.000042 |
| Sending data                 |   0.052017 |
| Waiting for query cache lock |   0.000036 |
| Sending data                 |   0.049105 |
| Waiting for query cache lock |   0.000034 |
| Sending data                 |   0.060379 |
| Waiting for query cache lock |   0.000029 |
| Sending data                 |   0.118096 |
| Waiting for query cache lock |   0.000027 |
| Sending data                 |   0.046678 |
| Waiting for query cache lock |   0.000037 |
| Sending data                 |   0.058142 |
| Waiting for query cache lock |   0.000027 |
| Sending data                 |   0.137221 |
| Waiting for query cache lock |   0.000046 |
| Sending data                 |   0.061968 |
| Waiting for query cache lock |   0.000045 |
| Sending data                 |   0.054456 |
| Waiting for query cache lock |   0.000045 |
| Sending data                 |   0.054738 |
| Waiting for query cache lock |   0.000025 |
| Sending data                 |   0.073921 |
| Waiting for query cache lock |   0.000028 |
| Sending data                 |   0.040226 |
| Waiting for query cache lock |   0.000034 |
| Sending data                 |   0.036024 |
| Waiting for query cache lock |   0.000046 |
| Sending data                 |   0.063350 |
| Waiting for query cache lock |   0.000035 |
| Sending data                 |   0.044174 |
| Waiting for query cache lock |   0.000028 |
| Sending data                 |   0.057658 |
| Waiting for query cache lock |   0.000031 |
| Sending data                 |   0.081529 |
| Waiting for query cache lock |   0.000041 |
| Sending data                 |   0.074443 |
| Waiting for query cache lock |   0.000027 |
| Sending data                 |   0.058293 |
| Waiting for query cache lock |   0.000033 |
| Sending data                 |   0.061443 |
| Waiting for query cache lock |   0.000037 |
| Sending data                 |   0.075515 |
| Waiting for query cache lock |   0.000035 |
| Sending data                 |   0.075264 |
| Waiting for query cache lock |   0.000039 |
| Sending data                 |   0.049696 |
| Waiting for query cache lock |   0.000027 |
| Sending data                 |   0.052088 |
| Waiting for query cache lock |   0.000036 |
| Sending data                 |   0.049119 |
| Waiting for query cache lock |   0.000039 |
| Sending data                 |   0.053930 |
| Waiting for query cache lock |   0.000028 |
| Sending data                 |   0.064894 |
| Waiting for query cache lock |   0.000039 |
| Sending data                 |   0.046105 |
| Waiting for query cache lock |   0.000027 |
| Sending data                 |   0.061840 |
| Waiting for query cache lock |   0.000033 |
| Sending data                 |   0.065950 |
| Waiting for query cache lock |   0.000038 |
| Sending data                 |   0.053379 |
| Waiting for query cache lock |   0.000028 |
| Sending data                 |   0.034599 |
| Waiting for query cache lock |   0.000027 |
| Sending data                 |   0.046371 |
| Waiting for query cache lock |   0.000033 |
| Sending data                 |   0.039776 |
| Waiting for query cache lock |   0.000035 |
| Sending data                 |   0.046480 |
| Waiting for query cache lock |   0.000040 |
| Sending data                 |   0.039515 |
| Waiting for query cache lock |   0.000036 |
| Sending data                 | 405.206648 |
| end                          |   0.000040 |
| query end                    |   0.000011 |
| closing tables               |   0.000017 |
| freeing items                |   0.000036 |
| logging slow query           |   0.000006 |
| logging slow query           |   0.000011 |
| cleaning up                  |   0.000006 |
+------------------------------+------------+
100 rows in set (0.01 sec)

我會很感激為什麼order by對性能影響如此之大的任何想法?15s 到 5mins+ 是一個很大的跳躍。

解釋 RolandoMySQLDBA 的查詢

SELECT B.* FROM (SELECT id FROM customer WHERE updated_at >= '2018-02-16 00:00:00' ORDER BY updated_at ASC) A LEFT JOIN customer B USING (id);
932016 rows in set (6 min 6.39 sec)

+----+-------------+------------+--------+---------------+------------+---------+------+--------+--------------------------+
| id | select_type | table      | type   | possible_keys | key        | key_len | ref  | rows   | Extra                    |
+----+-------------+------------+--------+---------------+------------+---------+------+--------+--------------------------+
|  1 | PRIMARY     | <derived2> | ALL    | NULL          | NULL       | NULL    | NULL | 932025 |                          |
|  1 | PRIMARY     | B          | eq_ref | PRIMARY       | PRIMARY    | 4       | A.id |      1 |                          |
|  2 | DERIVED     | customer   | range  | updated_at    | updated_at | 4       | NULL | 949956 | Using where; Using index |
+----+-------------+------------+--------+---------------+------------+---------+------+--------+--------------------------+

根據您在問題中的內容

  • 1899746表中的行
  • 931817行 >2018-02-16 00:00:00
  • 索引上updated_at有 200 個不同的值

EXPLAIN 計劃顯示了一個全表掃描和一個包含所有列的臨時表。這是有道理的,因為 WHERE 子句看到了 49.0496% 的表行。做一個完整的表比從鍵分佈中找出訪問索引也需要訪問表更容易。這就解釋了為什麼 EXPLAIN 計劃沒有選擇updated_at索引。

updated_at通過執行這個可以看到索引中鍵值的分佈

SELECT COUNT(1) rowcount,updated_at FROM customer GROUP BY updated_at;

您可以先使用索引來收集 id,然後再加入它們

SELECT B.* FROM
(SELECT id FROM customer
WHERE updated_at >= '2018-02-16 00:00:00'
ORDER BY updated_at ASC) A
LEFT JOIN customer B USING (id);

如果收集 ids 的子查詢選擇索引,那麼整個查詢可能會很快。請對此執行 EXPLAIN 並查看是否是這種情況。

如果這是您的應用程序的關鍵部分,您可以更改PRIMARY KEY開頭updated_at(並將現有主鍵更改為UNIQUE鍵/約束):

-- table/index reconstruction
-- this will take some time
-- but only needs to be done once
ALTER TABLE customer
 DROP PRIMARY KEY,
 DROP INDEX updated_at,
 ADD PRIMARY KEY (updated_at, id),
 ADD UNIQUE INDEX id_uq (id) ;

上面的改動會使表的聚群鍵基於updated_at,所以查詢會有2個好處:

  • 只做部分錶掃描,
  • 跳過排序部分,因為行已經按照您想要的方式排序。

引用自:https://dba.stackexchange.com/questions/198224