Query-Performance

MariaDB 想在 Aria 上使用奇怪且效率極低的查詢計劃,而不是在 InnoDB 上(5 秒與 1 小時以上的查詢)

  • August 25, 2020

我在 MariaDB 10.4.14 系統上按預期執行查詢時遇到了一些麻煩。

我們最近將數據庫從 InnoDB 遷移到 Aria,因此我們可以將 INDEX 單獨儲存在快速 PCI-e SSD 上,從而受益於索引搜尋性能的提高。

雖然到目前為止 Aria 的性能有了很大的提升,但只有一個查詢絕對不想合作。

在對錶執行 ANALYZE 命令後,此查詢將像以往一樣高效執行,但是當它接收到活動插入時,查詢計劃程序需要大約 10 分鐘才能再次失去控制。

我們在 InnoDB 上沒有這些問題。

無論如何,查詢如下:

SELECT 
 data.*, 
 urls.url, 
 file.timestamp, 
 source.location 
FROM data JOIN urls ON data.urlid = urls.id 
JOIN file ON data.fileid = file.id 
JOIN source ON data.sourceid = source.id  
WHERE urlid IN (SELECT id FROM urls WHERE vurl 
LIKE REVERSE('%.example.com')) 
ORDER BY timestamp DESC;

我期望它看起來像什麼:

+------+-------------+--------+--------+-----------------------+---------+---------+--------------------+------+-----------------------------------------------------------+
| id   | select_type | table  | type   | possible_keys         | key     | key_len | ref                | rows | Extra                                                     |
+------+-------------+--------+--------+-----------------------+---------+---------+--------------------+------+-----------------------------------------------------------+
|    1 | PRIMARY     | urls   | range  | PRIMARY,vurl          | vurl    | 765     | NULL               | 4    | Using where; Using index; Using temporary; Using filesort |
|    1 | PRIMARY     | urls   | eq_ref | PRIMARY               | PRIMARY | 8       | data.urls.id       | 1    |                                                           |
|    1 | PRIMARY     | data   | ref    | sourceid,urlid,fileid | urlid   | 8       | data.urls.id       | 4    | Using where                                               |
|    1 | PRIMARY     | file   | ref    | PRIMARY               | PRIMARY | 8       | data.data.fileid   | 1    |                                                           |
|    1 | PRIMARY     | source | eq_ref | PRIMARY               | PRIMARY | 2       | data.data.sourceid | 1    |                                                           |
+------+-------------+--------+--------+-----------------------+---------+---------+--------------------+------+-----------------------------------------------------------+

它看起來像什麼(沒有分析):

+------+-------------+--------+--------+-----------------------+----------+---------+------------------+---------+---------------------------------+
| id   | select_type | table  | type   | possible_keys         | key      | key_len | ref              | rows    | Extra                           |
+------+-------------+--------+--------+-----------------------+----------+---------+------------------+---------+---------------------------------+
|    1 | PRIMARY     | source | ALL    | PRIMARY               | NULL     | NULL    | NULL             | 15      | Using temporary; Using filesort |
|    1 | PRIMARY     | data   | ref    | sourceid,urlid,fileid | sourceid | 2       | data.source.id   | 197803  | Using where                     |
|    1 | PRIMARY     | urls   | eq_ref | PRIMARY               | PRIMARY  | 8       | data.data.urlid  | 1       |                                 |
|    1 | PRIMARY     | urls   | eq_ref | PRIMARY,vurl          | PRIMARY  | 8       | data.data.urlid  | 1       | Using where                     |
|    1 | PRIMARY     | file   | ref    | PRIMARY               | PRIMARY  | 8       | data.data.fileid | 7907384 |                                 |
+------+-------------+--------+--------+-----------------------+----------+---------+------------------+---------+---------------------------------+

同樣,這種行為僅在我們聲明使用 Aria 時才會出現,並且不會出現在我們從中遷移的 InnoDB 表中。我對 FORCE INDEX 或 IGNORE INDEX 有點修復感到更滿意,但由於查詢的複雜性,我在找到要使用的正確命令時遇到了一些麻煩。

誰能幫我解決這個問題?知道是什麼原因造成的嗎?(這是一個錯誤嗎?)

編輯:表格佈局如下:

CREATE TABLE `data` (
   `id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
   `entryid` INT(10) UNSIGNED NOT NULL DEFAULT 0,
   `sourceid` SMALLINT(5) UNSIGNED NOT NULL DEFAULT 1,
   `urlid` BIGINT(20) UNSIGNED NOT NULL DEFAULT 1,
   `fileid` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
   PRIMARY KEY (`id`),
   UNIQUE INDEX `entryid_sourceid_urlid` (`entryid`, `sourceid`, `urlid`),
   INDEX `entryid` (`entryid`),
   INDEX `sourceid` (`sourceid`),
   INDEX `urlid` (`urlid`),
   INDEX `fileid` (`fileid`)
)
COLLATE='utf8_bin'
ENGINE=InnoDB;

CREATE TABLE `file` (
   `id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
   `file` BLOB NOT NULL DEFAULT '',
   `hash` BINARY(28) NOT NULL DEFAULT '\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0' COMMENT 'SHA-224 hash of file in binary',
   `timestamp` INT(10) UNSIGNED NULL DEFAULT NULL,
   `last` INT(10) UNSIGNED NOT NULL,
   PRIMARY KEY (`id`, `timestamp`),
   INDEX `hash` (`hash`(5)) USING HASH,
   INDEX `timestamp` (`timestamp`),
   INDEX `last` (`last`)
)
COLLATE='utf8_bin'
ENGINE=Aria
PARTITION BY RANGE (`timestamp`)
(PARTITION `p2017` VALUES LESS THAN (1514761200) DATA DIRECTORY = '/data/mysql' ENGINE = Aria,
PARTITION `p2018` VALUES LESS THAN (1546297200) DATA DIRECTORY = '/data/mysql' ENGINE = Aria,
PARTITION `p2019` VALUES LESS THAN (1577833200) DATA DIRECTORY = '/data/mysql' ENGINE = Aria,
PARTITION `p2020` VALUES LESS THAN (1609455600) DATA DIRECTORY = '/data/mysql' ENGINE = Aria,
PARTITION `p2021` VALUES LESS THAN (1640991600) DATA DIRECTORY = '/data/mysql' ENGINE = Aria,
PARTITION `p2022` VALUES LESS THAN (1672527600) DATA DIRECTORY = '/data/mysql' ENGINE = Aria,
PARTITION `p2023` VALUES LESS THAN (1704063600) DATA DIRECTORY = '/data/mysql' ENGINE = Aria,
PARTITION `p2024` VALUES LESS THAN (1735686000) DATA DIRECTORY = '/data/mysql' ENGINE = Aria,
PARTITION `p2025` VALUES LESS THAN (1767222000) DATA DIRECTORY = '/data/mysql' ENGINE = Aria,
PARTITION `p2026` VALUES LESS THAN (1798758000) DATA DIRECTORY = '/data/mysql' ENGINE = Aria,
PARTITION `p2027` VALUES LESS THAN (1830294000) DATA DIRECTORY = '/data/mysql' ENGINE = Aria,
PARTITION `p2028` VALUES LESS THAN (1861916400) DATA DIRECTORY = '/data/mysql' ENGINE = Aria,
PARTITION `p2029` VALUES LESS THAN (1893452400) DATA DIRECTORY = '/data/mysql' ENGINE = Aria,
PARTITION `p2030` VALUES LESS THAN (1924988400) DATA DIRECTORY = '/data/mysql' ENGINE = Aria,
PARTITION `p2031` VALUES LESS THAN (1956524400) DATA DIRECTORY = '/data/mysql' ENGINE = Aria);

CREATE TABLE `source` (
   `id` SMALLINT(5) UNSIGNED NOT NULL AUTO_INCREMENT,
   `source` TINYTEXT NOT NULL DEFAULT '' COLLATE 'utf8_bin',
   `last` INT(10) UNSIGNED NULL DEFAULT 0 COMMENT 'Last received item from this source',
   `updated` TIMESTAMP NULL DEFAULT current_timestamp() ON UPDATE current_timestamp() COMMENT 'Last time the last value was updated',
   PRIMARY KEY (`id`),
   UNIQUE INDEX `source` (`source`) USING HASH
)
COLLATE='utf8_bin'
ENGINE=InnoDB;

CREATE TABLE `urls` (
   `id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
   `url` VARCHAR(254) NOT NULL COLLATE 'utf8_bin',
   `vurl` VARCHAR(254) AS (reverse(`url`)) VIRTUAL,
   PRIMARY KEY (`id`),
   UNIQUE INDEX `url` (`url`),
   INDEX `vurl` (`vurl`)
)
COLLATE='utf8_bin'
ENGINE=InnoDB;

我通過更改要STRAIGHT_JOINfileandsource表上使用的查詢解決了這個問題:

SELECT 
 data.*, 
 urls.url, 
 file.timestamp, 
 source.location 
FROM data JOIN urls ON data.urlid = urls.id 
STRAIGHT_JOIN file ON data.fileid = file.id 
STRAIGHT_JOIN source ON data.sourceid = source.id  
WHERE urlid IN (SELECT id FROM urls WHERE vurl 
LIKE REVERSE('%.example.com')) 
ORDER BY timestamp DESC;

我發現那裡沒有很多人在談論STRAIGHT_JOIN’,所以我最初在搜尋時錯過了它。

MySQL Reference中,關於它的說法如下:

STRAIGHT_JOIN 與 JOIN 類似,只是左表總是在右表之前讀取。這可用於連接優化器以次優順序處理表的那些(少數)情況。

在 MariaDB 知識庫中,有一篇文章提到它:https ://mariadb.com/kb/en/index-hints-how-to-force-query-plans/

我強烈建議閱讀它,因為這個頁面描述了一些強制不同查詢計劃的方法。

引用自:https://dba.stackexchange.com/questions/273897