Query-Performance
MariaDB 想在 Aria 上使用奇怪且效率極低的查詢計劃,而不是在 InnoDB 上(5 秒與 1 小時以上的查詢)
我在 MariaDB 10.4.14 系統上按預期執行查詢時遇到了一些麻煩。
我們最近將數據庫從 InnoDB 遷移到 Aria,因此我們可以將 INDEX 單獨儲存在快速 PCI-e SSD 上,從而受益於索引搜尋性能的提高。
雖然到目前為止 Aria 的性能有了很大的提升,但只有一個查詢絕對不想合作。
在對錶執行 ANALYZE 命令後,此查詢將像以往一樣高效執行,但是當它接收到活動插入時,查詢計劃程序需要大約 10 分鐘才能再次失去控制。
我們在 InnoDB 上沒有這些問題。
無論如何,查詢如下:
SELECT data.*, urls.url, file.timestamp, source.location FROM data JOIN urls ON data.urlid = urls.id JOIN file ON data.fileid = file.id JOIN source ON data.sourceid = source.id WHERE urlid IN (SELECT id FROM urls WHERE vurl LIKE REVERSE('%.example.com')) ORDER BY timestamp DESC;
我期望它看起來像什麼:
+------+-------------+--------+--------+-----------------------+---------+---------+--------------------+------+-----------------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+--------+--------+-----------------------+---------+---------+--------------------+------+-----------------------------------------------------------+ | 1 | PRIMARY | urls | range | PRIMARY,vurl | vurl | 765 | NULL | 4 | Using where; Using index; Using temporary; Using filesort | | 1 | PRIMARY | urls | eq_ref | PRIMARY | PRIMARY | 8 | data.urls.id | 1 | | | 1 | PRIMARY | data | ref | sourceid,urlid,fileid | urlid | 8 | data.urls.id | 4 | Using where | | 1 | PRIMARY | file | ref | PRIMARY | PRIMARY | 8 | data.data.fileid | 1 | | | 1 | PRIMARY | source | eq_ref | PRIMARY | PRIMARY | 2 | data.data.sourceid | 1 | | +------+-------------+--------+--------+-----------------------+---------+---------+--------------------+------+-----------------------------------------------------------+
它看起來像什麼(沒有分析):
+------+-------------+--------+--------+-----------------------+----------+---------+------------------+---------+---------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+--------+--------+-----------------------+----------+---------+------------------+---------+---------------------------------+ | 1 | PRIMARY | source | ALL | PRIMARY | NULL | NULL | NULL | 15 | Using temporary; Using filesort | | 1 | PRIMARY | data | ref | sourceid,urlid,fileid | sourceid | 2 | data.source.id | 197803 | Using where | | 1 | PRIMARY | urls | eq_ref | PRIMARY | PRIMARY | 8 | data.data.urlid | 1 | | | 1 | PRIMARY | urls | eq_ref | PRIMARY,vurl | PRIMARY | 8 | data.data.urlid | 1 | Using where | | 1 | PRIMARY | file | ref | PRIMARY | PRIMARY | 8 | data.data.fileid | 7907384 | | +------+-------------+--------+--------+-----------------------+----------+---------+------------------+---------+---------------------------------+
同樣,這種行為僅在我們聲明使用 Aria 時才會出現,並且不會出現在我們從中遷移的 InnoDB 表中。我對 FORCE INDEX 或 IGNORE INDEX 有點修復感到更滿意,但由於查詢的複雜性,我在找到要使用的正確命令時遇到了一些麻煩。
誰能幫我解決這個問題?知道是什麼原因造成的嗎?(這是一個錯誤嗎?)
編輯:表格佈局如下:
CREATE TABLE `data` ( `id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT, `entryid` INT(10) UNSIGNED NOT NULL DEFAULT 0, `sourceid` SMALLINT(5) UNSIGNED NOT NULL DEFAULT 1, `urlid` BIGINT(20) UNSIGNED NOT NULL DEFAULT 1, `fileid` BIGINT(20) UNSIGNED NULL DEFAULT NULL, PRIMARY KEY (`id`), UNIQUE INDEX `entryid_sourceid_urlid` (`entryid`, `sourceid`, `urlid`), INDEX `entryid` (`entryid`), INDEX `sourceid` (`sourceid`), INDEX `urlid` (`urlid`), INDEX `fileid` (`fileid`) ) COLLATE='utf8_bin' ENGINE=InnoDB; CREATE TABLE `file` ( `id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT, `file` BLOB NOT NULL DEFAULT '', `hash` BINARY(28) NOT NULL DEFAULT '\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0' COMMENT 'SHA-224 hash of file in binary', `timestamp` INT(10) UNSIGNED NULL DEFAULT NULL, `last` INT(10) UNSIGNED NOT NULL, PRIMARY KEY (`id`, `timestamp`), INDEX `hash` (`hash`(5)) USING HASH, INDEX `timestamp` (`timestamp`), INDEX `last` (`last`) ) COLLATE='utf8_bin' ENGINE=Aria PARTITION BY RANGE (`timestamp`) (PARTITION `p2017` VALUES LESS THAN (1514761200) DATA DIRECTORY = '/data/mysql' ENGINE = Aria, PARTITION `p2018` VALUES LESS THAN (1546297200) DATA DIRECTORY = '/data/mysql' ENGINE = Aria, PARTITION `p2019` VALUES LESS THAN (1577833200) DATA DIRECTORY = '/data/mysql' ENGINE = Aria, PARTITION `p2020` VALUES LESS THAN (1609455600) DATA DIRECTORY = '/data/mysql' ENGINE = Aria, PARTITION `p2021` VALUES LESS THAN (1640991600) DATA DIRECTORY = '/data/mysql' ENGINE = Aria, PARTITION `p2022` VALUES LESS THAN (1672527600) DATA DIRECTORY = '/data/mysql' ENGINE = Aria, PARTITION `p2023` VALUES LESS THAN (1704063600) DATA DIRECTORY = '/data/mysql' ENGINE = Aria, PARTITION `p2024` VALUES LESS THAN (1735686000) DATA DIRECTORY = '/data/mysql' ENGINE = Aria, PARTITION `p2025` VALUES LESS THAN (1767222000) DATA DIRECTORY = '/data/mysql' ENGINE = Aria, PARTITION `p2026` VALUES LESS THAN (1798758000) DATA DIRECTORY = '/data/mysql' ENGINE = Aria, PARTITION `p2027` VALUES LESS THAN (1830294000) DATA DIRECTORY = '/data/mysql' ENGINE = Aria, PARTITION `p2028` VALUES LESS THAN (1861916400) DATA DIRECTORY = '/data/mysql' ENGINE = Aria, PARTITION `p2029` VALUES LESS THAN (1893452400) DATA DIRECTORY = '/data/mysql' ENGINE = Aria, PARTITION `p2030` VALUES LESS THAN (1924988400) DATA DIRECTORY = '/data/mysql' ENGINE = Aria, PARTITION `p2031` VALUES LESS THAN (1956524400) DATA DIRECTORY = '/data/mysql' ENGINE = Aria); CREATE TABLE `source` ( `id` SMALLINT(5) UNSIGNED NOT NULL AUTO_INCREMENT, `source` TINYTEXT NOT NULL DEFAULT '' COLLATE 'utf8_bin', `last` INT(10) UNSIGNED NULL DEFAULT 0 COMMENT 'Last received item from this source', `updated` TIMESTAMP NULL DEFAULT current_timestamp() ON UPDATE current_timestamp() COMMENT 'Last time the last value was updated', PRIMARY KEY (`id`), UNIQUE INDEX `source` (`source`) USING HASH ) COLLATE='utf8_bin' ENGINE=InnoDB; CREATE TABLE `urls` ( `id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT, `url` VARCHAR(254) NOT NULL COLLATE 'utf8_bin', `vurl` VARCHAR(254) AS (reverse(`url`)) VIRTUAL, PRIMARY KEY (`id`), UNIQUE INDEX `url` (`url`), INDEX `vurl` (`vurl`) ) COLLATE='utf8_bin' ENGINE=InnoDB;
我通過更改要
STRAIGHT_JOIN
在file
andsource
表上使用的查詢解決了這個問題:SELECT data.*, urls.url, file.timestamp, source.location FROM data JOIN urls ON data.urlid = urls.id STRAIGHT_JOIN file ON data.fileid = file.id STRAIGHT_JOIN source ON data.sourceid = source.id WHERE urlid IN (SELECT id FROM urls WHERE vurl LIKE REVERSE('%.example.com')) ORDER BY timestamp DESC;
我發現那裡沒有很多人在談論
STRAIGHT_JOIN
’,所以我最初在搜尋時錯過了它。在MySQL Reference中,關於它的說法如下:
STRAIGHT_JOIN 與 JOIN 類似,只是左表總是在右表之前讀取。這可用於連接優化器以次優順序處理表的那些(少數)情況。
在 MariaDB 知識庫中,有一篇文章提到它:https ://mariadb.com/kb/en/index-hints-how-to-force-query-plans/
我強烈建議閱讀它,因為這個頁面描述了一些強制不同查詢計劃的方法。