Mariadb

使用使用 JOIN 的查詢獲取大表的最後幾行

  • February 4, 2022

我想獲取包含超過 10M 行的表(名為 CONTENT)的最後(或最後幾行)行。該查詢包含其他 2 個表的連接,而且速度非常慢。這些是表定義和我的查詢:

CREATE TABLE `USER` (
 `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
 `value` varchar(64) DEFAULT NULL,
 PRIMARY KEY (`id`),
 KEY (`value`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1;


CREATE TABLE `GUID` (
 `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
 `value` char(36) DEFAULT NULL,
 PRIMARY KEY (`id`),
 KEY `value` (`value`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1;


CREATE TABLE `CONTENT` (
 `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
 `user_id` int(11) unsigned DEFAULT NULL,
 `guid_id` int(11) unsigned DEFAULT NULL,
 `timestamp` datetime DEFAULT NULL,
 PRIMARY KEY (`id`),
 KEY `guid` (`guid_id`),
 KEY `user_id` (`user_id`),
 KEY `timestamp` (`timestamp`),

 CONSTRAINT `CONTENT_ibfk_4` FOREIGN KEY (`guid_id`) REFERENCES `GUID` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
 CONSTRAINT `CONTENT_ibfk_5` FOREIGN KEY (`user_id`) REFERENCES `USER` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

詢問:

SELECT 
`CONTENT`.`id`,
`GUID`.`value` AS `guid_value`, 
`USER`.`value` AS `user_value` 
FROM `CONTENT`, `USER`, `GUID` 
WHERE `CONTENT`.`user_id` = `USER`.`id` 
AND `CONTENT`.`guid_id` = `GUID`.`id` 
ORDER BY `CONTENT`.`timestamp` DESC LIMIT 1
 # even without ORDER BY the query is slow as seen by explain command

這些是複制為 INSERT 的解釋命令的結果:

+------+-------------+---------+--------+---------------+---------+---------+-----------------------------+-------+----------------------------------------------+
| id   | select_type | table   | type   | possible_keys | key     | key_len | ref                         | rows  | Extra                                        |
+------+-------------+---------+--------+---------------+---------+---------+-----------------------------+-------+----------------------------------------------+
|    1 | SIMPLE      | GUID    | index  | PRIMARY       | value   | 37      | NULL                        | 16329 | Using index; Using temporary; Using filesort |
|    1 | SIMPLE      | CONTENT | ref    | guid,user_id  | guid    | 5       | MANAGER.GUID.id             | 293   | Using where                                  |
|    1 | SIMPLE      | USER    | eq_ref | PRIMARY       | PRIMARY | 4       | MANAGER.CONTENT.user_id     | 1     |                                              |
+------+-------------+---------+--------+---------------+---------+---------+-----------------------------+-------+----------------------------------------------+

該查詢無法使用,因此我將其拆分為 2 個查詢。首先,我檢索感興趣的 CONTENT.id,其次,我WHERE CONTENT.id = x在 SELECT 語句中插入一個附加子句。似乎在原始查詢中 MariaDB 優化器不理解我只需要 1 行,因此它使用 GUID 表中的每一行生成笛卡爾積。是否將查詢拆分為 2 個子查詢?有人可以確認笛卡爾積運算確實是導致問題的運算嗎?(解釋命令結果的第一行)

解釋 Ricks 查詢:

+------+-------------+------------+--------+----------------------+-------------------+---------+-----------------------------+---------+-------------+
| id   | select_type | table      | type   | possible_keys        | key               | key_len | ref                         | rows    | Extra       |
+------+-------------+------------+--------+----------------------+-------------------+---------+-----------------------------+---------+-------------+
|    1 | PRIMARY     | <derived2> | ALL    | NULL                 | NULL              | NULL    | NULL                        | 2       |             |
|    1 | PRIMARY     | CONTENT    | eq_ref | PRIMARY,guid,user_id | PRIMARY           | 4       | c.id                        | 1       | Using where |
|    1 | PRIMARY     | USER       | eq_ref | PRIMARY              | PRIMARY           | 4       | MANAGER.CONTENT.user_id     | 1       |             |
|    1 | PRIMARY     | GUID       | eq_ref | PRIMARY              | PRIMARY           | 4       | MANAGER.CONTENT.guid_id     | 1       |             |
|    2 | DERIVED     | CONTENT    | index  | NULL                 | timestamp         | 6       | NULL                        | 9474301 | Using index |
+------+-------------+------------+--------+----------------------+-------------------+---------+-----------------------------+---------+-------------+

請查看它的執行速度以及它的解釋說:

SELECT  content.`id`, `GUID`.`value` AS `guid_value`, `USER`.`value` AS `user_value`
   FROM  ( SELECT id FROM content ORDER BY timestamp DESC LIMIT 1 ) as c
   JOIN content  ON content.id = c.id
   JOIN `USER`  ON user.id = content.user_id
   JOIN `GUID`  ON guid.id = content.guid_id

引用自:https://dba.stackexchange.com/questions/306989