Mysql

為什麼 max(id) with join 不起作用或如何通過 join 優化它?

  • November 11, 2019

下面是我的MESSAGES_THREAD表:

CREATE TABLE `messages_thread` (
 `id` varchar(64) NOT NULL,
 `creator` mediumint(9) NOT NULL,
 `other` mediumint(9) NOT NULL,
 `time` datetime DEFAULT CURRENT_TIMESTAMP,
 PRIMARY KEY (`id`),
 KEY `messages_thread_creator_index` (`creator`),
 KEY `messages_thread_other_index` (`other`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

下面是我的MESSAGES表:

CREATE TABLE `messages` (
 `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
 `userId` mediumint(8) unsigned NOT NULL,
 `threadId` varchar(64) NOT NULL,
 `status` tinyint(3) unsigned DEFAULT ''0'' COMMENT ''0->sent, 1->deliveried, 2->seen'',
 `time` datetime DEFAULT CURRENT_TIMESTAMP,
 `message` text NOT NULL,
 `data_url` varchar(512) DEFAULT NULL,
 PRIMARY KEY (`id`),
 KEY `messages_threadId_status_index` (`threadId`,`status`),
 KEY `index5` (`userId`),
 KEY `message_threadId_id_index` (`threadId`,`id` DESC),
 CONSTRAINT `messages_messages_thread_id_fk` FOREIGN KEY (`threadId`) REFERENCES `messages_thread` (`id`),
 CONSTRAINT `messages_user_userId_fk` FOREIGN KEY (`userId`) REFERENCES `user` (`userId`)
) ENGINE=InnoDB AUTO_INCREMENT=2522 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

我想MESSAGES根據其 threadId 從表中獲取最後一條消息 id,為此我嘗試了以下兩個Queries結果都可以,但效率不高。第二個Query似乎很有效,但沒有達到我接受的標準。我怎樣才能Query 有效地寫這個?


下面是第一QUERY

SELECT max(m.id), mt.other 
FROM messages_thread mt 
join messages m ON mt.id = m.threadId 
WHERE mt.creator=1 
group by mt.other;

下面是圖形執行計劃:

在此處輸入圖像描述

下面是會話狀態:

Handler_commit  1
Handler_delete  0
Handler_discover    0
Handler_external_lock   4
Handler_mrr_init    0
Handler_prepare 0
Handler_read_first  1
Handler_read_key    41
Handler_read_last   0
Handler_read_next   2561
Handler_read_prev   0
Handler_read_rnd    0
Handler_read_rnd_next   0
Handler_rollback    0
Handler_savepoint   0
Handler_savepoint_rollback  0
Handler_update  0
Handler_write   0

下面是第二個Query(似乎有點優化):

SELECT m2.id, mt.other 
FROM messages_thread mt 
inner join ( select max(id) id, threadId 
          from messages m 
          group by threadId) as m2 on mt.id=m2.threadId
WHERE mt.creator=1;

下面是圖形執行計劃:

在此處輸入圖像描述

下面是會話狀態:

Handler_commit  1
Handler_delete  0
Handler_discover    0
Handler_external_lock   4
Handler_mrr_init    0
Handler_prepare 0
Handler_read_first  1
Handler_read_key    122
Handler_read_last   1
Handler_read_next   0
Handler_read_prev   0
Handler_read_rnd    0
Handler_read_rnd_next   41
Handler_rollback    0
Handler_savepoint   0
Handler_savepoint_rollback  0
Handler_update  0
Handler_write   40

LATERAL使用成員 Madhur Bhaiya 建議的查詢執行詳細資訊

以下是 Madhur Bhaiya 建議的查詢LATERAL

SELECT m2.id, mt.other 
FROM messages_thread mt 
join lateral (select max(m.id) as id
              from messages m 
              where m.threadId = mt.id ) as m2
WHERE mt.creator=1;

下面是圖形執行計劃:

在此處輸入圖像描述

下面是會話狀態:

Handler_commit  1
Handler_delete  0
Handler_discover    0
Handler_external_lock   4
Handler_mrr_init    0
Handler_prepare 0
Handler_read_first  0
Handler_read_key    41
Handler_read_last   0
Handler_read_next   2561
Handler_read_prev   0
Handler_read_rnd    0
Handler_read_rnd_next   80
Handler_rollback    0
Handler_savepoint   0
Handler_savepoint_rollback  0
Handler_update  0
Handler_write   40

下面是解釋數據:

1   PRIMARY mt      ref index5,messages_thread_creator_index,message_creator_other_index    message_creator_other_index 3   const   40  100.00  Using index; Rematerialize (<derived2>)
1   PRIMARY <derived2>      ALL                 2   100.00  
2   DEPENDENT DERIVED   m       ref messages_threadId_status_index,messages_threadId_index,messages_threadId_id_index   messages_threadId_status_index  258 tolet.mt.id 61  100.00  Using index

MySQL 8.0.14 引入了橫向派生表。在您的第二個查詢中,您正在計算max()所有 的值threadId,無論您在 JOIN 期間是否需要它。您可以通過使用以下內容來避免這種實現m2(一個昂貴的過程,基本上是臨時表創建,無論是在記憶體中還是在磁碟中(如果太大)):

SELECT max(m2.id), mt.other 
FROM messages_thread mt 
join lateral ( select max(m.id) as id
              from messages m 
              where m.threadId = mt.id ) as m2
WHERE mt.creator=1 
group by other;

另外,嘗試在主查詢messages_thread中為Group By Optimization定義一個新的表複合索引: . 您可以在有或沒有新索引建議的情況下對這個新查詢進行基準測試,以決定是否保留它。SELECT``(creator, other)

我注意到計劃 1-3 使用全表掃描。小表可能不是問題(據我了解messages_thread只有 40 條記錄)。因此,如果您希望表增長並具有各種 mt.creator 值,那麼您可能希望mt通過索引訪問而不是掃描它。

我認為一旦表增長,查詢優化器會自動切換到這種模式。但是如果你想用目前數據查看它,你可以嘗試使用提示:

SELECT max(m.id), mt.other 
FROM messages_thread mt USE INDEX(messages_thread_creator_index)
join messages m ON mt.id = m.threadId 
WHERE mt.creator=1 
group by mt.other;

或創建附加索引:

create index mt_creator_thread_id_index on messages_thread(creator, id);

(但後者不太可能影響小數據的執行計劃。)

額外的:

實際上,在我的系統上(5.7 有空表),計劃 2 不使用帶或不帶的全表掃描USE INDEX (message_threadId_id_index),它基於以下索引m2訪問:<auto_key0>``mt.id

mysql> explain SELECT m2.id, mt.other  FROM messages_thread mt  inner join ( select max(id) id, threadId             from messages m             group by threadId) as m2 on mt.id=m2.threadId WHERE mt.creator=1\G
*************************** 1. row ***************************
          id: 1
 select_type: PRIMARY
       table: mt
  partitions: NULL
        type: ref
possible_keys: PRIMARY,messages_thread_creator_index
         key: messages_thread_creator_index
     key_len: 3
         ref: const
        rows: 1
    filtered: 100.00
       Extra: NULL
*************************** 2. row ***************************
          id: 1
 select_type: PRIMARY
       table: <derived2>
  partitions: NULL
        type: ref
possible_keys: <auto_key0>
         key: <auto_key0>
     key_len: 258
         ref: test.mt.id
        rows: 2
    filtered: 100.00
       Extra: NULL
*************************** 3. row ***************************
          id: 2
 select_type: DERIVED
       table: m
  partitions: NULL
        type: index
possible_keys: messages_threadId_status_index,message_threadId_id_index
         key: messages_threadId_status_index
     key_len: 260
         ref: NULL
        rows: 1
    filtered: 100.00
       Extra: Using index


mysql> explain SELECT m2.id, mt.other  FROM messages_thread mt  inner join ( select max(id) id, threadId             from messages m USE INDEX (message_threadId_id_index)            group by threadId) as m2 on mt.id=m2.threadId WHERE mt.creator=1\G
*************************** 1. row ***************************
          id: 1
 select_type: PRIMARY
       table: mt
  partitions: NULL
        type: ref
possible_keys: PRIMARY,messages_thread_creator_index
         key: messages_thread_creator_index
     key_len: 3
         ref: const
        rows: 1
    filtered: 100.00
       Extra: NULL
*************************** 2. row ***************************
          id: 1
 select_type: PRIMARY
       table: <derived2>
  partitions: NULL
        type: ref
possible_keys: <auto_key0>
         key: <auto_key0>
     key_len: 258
         ref: test.mt.id
        rows: 2
    filtered: 100.00
       Extra: NULL
*************************** 3. row ***************************
          id: 2
 select_type: DERIVED
       table: m
  partitions: NULL
        type: index
possible_keys: messages_threadId_status_index,message_threadId_id_index
         key: message_threadId_id_index
     key_len: 266
         ref: NULL
        rows: 1
    filtered: 100.00
       Extra: Using index

它仍然顯示Rows: 2派生表,但這可能是查詢優化器做出的估計,而不是它實際上為每個 id 值讀取兩條記錄。

引用自:https://dba.stackexchange.com/questions/250051