MariaDB 10.4 查詢優化器錯誤的執行計劃

December 13, 2020

我對 MariaDB 伺服器上的某些表和查詢優化器有一個非常奇怪的行為。

首先，我們有我們的主表，我們稱它為巨大的，我們在同一個數據庫上有另一個表，它是他的複製，行數有限。

巨大的桌子有一個範圍

+--------------------+--------------------+
| min(dispatch_time) | max(dispatch_time) |
+--------------------+--------------------+
| 20070402114058     | 20201207000108     |
+--------------------+--------------------+

有計數行

+----------+
| count(*) |
+----------+
| 46683586 |
+----------+

複製一個有一個範圍

+--------------------+--------------------+
| min(dispatch_time) | max(dispatch_time) |
+--------------------+--------------------+
| 20190101143607     | 20201207000108     |
+--------------------+--------------------+

有計數行

+----------+
| count(*) |
+----------+
| 10346027 |
+----------+

它們具有相同的索引

巨大的

+-----------+------------+----------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table     | Non_unique | Key_name       | Seq_in_index | Column_name   | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------+------------+----------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| table1 |          0 | PRIMARY        |            1 | order_id      | A         |    44735742 |     NULL | NULL   |      | BTREE      |         |               |
| table1 |          1 | Index_Customer |            1 | customer_id   | A         |    11183935 |     NULL | NULL   | YES  | BTREE      |         |               |
| table1 |          1 | Index_3        |            1 | dispatch_time | A         |    44735742 |     NULL | NULL   | YES  | BTREE      |         |               |
+-----------+------------+----------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

複製更小的一個

+-----------+------------+----------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table     | Non_unique | Key_name       | Seq_in_index | Column_name   | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------+------------+----------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| table1 |          0 | PRIMARY        |            1 | order_id      | A         |    10346027 |     NULL | NULL   |      | BTREE      |         |               |
| table1 |          1 | Index_Customer |            1 | customer_id   | A         |     2041159 |     NULL | NULL   | YES  | BTREE      |         |               |
| table1 |          1 | Index_3        |            1 | dispatch_time | A         |     8070853 |     NULL | NULL   | YES  | BTREE      |         |               |
+-----------+------------+----------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

現在問題出在這個簡單的特定查詢上。

如果我們執行這個，在巨大的一張桌子上

EXPLAIN SELECT * FROM   `table1` WHERE
    ( `dispatch_time` BETWEEN '20190201' AND '20190601' );
+------+-------------+-----------+-------+---------------+---------+---------+------+---------+-----------------------+
| id   | select_type | table     | type  | possible_keys | key     | key_len | ref  | rows    | Extra                 |
+------+-------------+-----------+-------+---------------+---------+---------+------+---------+-----------------------+
|    1 | SIMPLE      | table1    | range | Index_3       | Index_3 | 15      | NULL | 5201896 | Using index condition |
+------+-------------+-----------+-------+---------------+---------+---------+------+---------+-----------------------+


SELECT SQL_NO_CACHE * FROM   `table1` WHERE
    ( `dispatch_time` BETWEEN '20190201' AND '20190601' );
1695926 rows in set (21.730 sec)

到現在為止還挺好。它使用類型範圍，使用索引條件，結果行時間是可以接受的，一切都很好。

但是在較小的情況下，查看相同查詢會發生什麼

EXPLAIN SELECT * FROM   `table1` WHERE
    ( `dispatch_time` BETWEEN '20190201' AND '20190601' );
+------+-------------+-----------+------+---------------+------+---------+------+----------+-------------+
| id   | select_type | table     | type | possible_keys | key  | key_len | ref  | rows     | Extra       |
+------+-------------+-----------+------+---------------+------+---------+------+----------+-------------+
|    1 | SIMPLE      | table1    | ALL  | Index_3       | NULL | NULL    | NULL | 10346027 | Using where |
+------+-------------+-----------+------+---------------+------+---------+------+----------+-------------+

SELECT SQL_NO_CACHE * FROM   `table1` WHERE
    ( `dispatch_time` BETWEEN '20190201' AND '20190601' );
1695926 rows in set (39.470 sec)

它執行全表掃描，類型為 ALL 且未使用索引。

查詢優化器不會因為成本而選擇索引，因為我已經做了優化器跟踪，這是有問題的部分。

在巨大的桌子上

{\
                       "index": "Index_3",\
                       "ranges": ["(20190201) &lt;= (dispatch_time) &lt;= (20190601)"],\
                       "rowid_ordered": false,\
                       "using_mrr": false,\
                       "index_only": false,\
                       "rows": 5201896,\
                       "cost": 6.51e6,\
                       "chosen": true\
}

在複製較小的一個

{\
                       "index": "Index_3",\
                       "ranges": ["(20190201) &lt;= (dispatch_time) &lt;= (20190601)"],\
                       "rowid_ordered": false,\
                       "using_mrr": false,\
                       "index_only": false,\
                       "rows": 3375750,\
                       "cost": 4.23e6,\
                       "chosen": false,\
                       "cause": "cost"\
}

我得出的結論是 CLONE 表的基數不會使查詢優化器使用索引，但事情是為什麼？

為什麼要執行並以這種方式執行並在較小的表上進行全表掃描，儘管索引在那裡？如何告訴優化器改變計劃？如果使用強制索引，它使用索引，結果行時間類似於 HUGE 表。

SELECT SQL_NO_CACHE * FROM   `table1` FORCE INDEX (Index_3) WHERE ( `dispatch_time` BETWEEN '20190201' AND '20190601' );

我已經為這張桌子上的所有人做了多次分析表，沒有任何改變。嘗試了各種調整等，但執行計劃始終不使用小型 CLONE 表上的索引條件。

有人有想法嗎？

謝謝你。

編輯文章：

顯示創建表

巨大的

table1 | CREATE TABLE `table1` (
 `order_id` int(11) NOT NULL AUTO_INCREMENT,
 `customer_id` int(11) DEFAULT NULL,
 `client_id` int(11) DEFAULT NULL,
 `table_id` int(11) DEFAULT NULL,
 `user_id` int(11) DEFAULT NULL,
 `codename` int(11) DEFAULT NULL,
 `start_time` char(14) DEFAULT NULL,
 `dispatch_time` char(14) DEFAULT NULL,
 `change_time` char(14) DEFAULT NULL,
 `buffet_time` char(14) DEFAULT NULL,
 `receipt_time` char(14) DEFAULT NULL,
 `delivery_time` char(14) DEFAULT NULL,
 `client_time` char(14) DEFAULT NULL,
 `return_time` char(14) DEFAULT NULL,
 `expected_time` char(14) DEFAULT NULL,
 `completion_time` char(14) DEFAULT NULL,
 `total` double DEFAULT NULL,
 `promotion` double DEFAULT NULL,
 `takeaway` int(11) DEFAULT NULL,
 `esan` int(11) DEFAULT NULL,
 `destroy` int(11) DEFAULT NULL,
 `person` int(11) DEFAULT NULL,
 `valid` int(11) DEFAULT NULL,
 `returned` int(11) DEFAULT NULL,
 `invoice` int(11) DEFAULT NULL,
 `discount` double DEFAULT NULL,
 `discountS` double DEFAULT NULL,
 `policy` int(11) DEFAULT NULL,
 `packing` int(11) DEFAULT NULL,
 `production` int(11) DEFAULT NULL,
 `vitrine` int(11) DEFAULT NULL,
 `sch_start` char(14) DEFAULT NULL,
 `sch_finish` char(14) DEFAULT NULL,
 `batch_time` char(14) DEFAULT NULL,
 `comments` varchar(255) DEFAULT NULL,
 `preorder` int(11) DEFAULT NULL,
 PRIMARY KEY (`order_id`),
 KEY `Index_Customer` (`customer_id`),
 KEY `Index_3` (`dispatch_time`)
) ENGINE=InnoDB AUTO_INCREMENT=46739244 DEFAULT CHARSET=greek COMMENT='InnoDB free: 12288 kB'

複製一個

table1 | CREATE TABLE `table1` (
 `order_id` int(11) NOT NULL AUTO_INCREMENT,
 `customer_id` int(11) DEFAULT NULL,
 `client_id` int(11) DEFAULT NULL,
 `table_id` int(11) DEFAULT NULL,
 `user_id` int(11) DEFAULT NULL,
 `codename` int(11) DEFAULT NULL,
 `start_time` char(14) DEFAULT NULL,
 `dispatch_time` char(14) DEFAULT NULL,
 `change_time` char(14) DEFAULT NULL,
 `buffet_time` char(14) DEFAULT NULL,
 `receipt_time` char(14) DEFAULT NULL,
 `delivery_time` char(14) DEFAULT NULL,
 `client_time` char(14) DEFAULT NULL,
 `return_time` char(14) DEFAULT NULL,
 `expected_time` char(14) DEFAULT NULL,
 `completion_time` char(14) DEFAULT NULL,
 `total` double DEFAULT NULL,
 `promotion` double DEFAULT NULL,
 `takeaway` int(11) DEFAULT NULL,
 `esan` int(11) DEFAULT NULL,
 `destroy` int(11) DEFAULT NULL,
 `person` int(11) DEFAULT NULL,
 `valid` int(11) DEFAULT NULL,
 `returned` int(11) DEFAULT NULL,
 `invoice` int(11) DEFAULT NULL,
 `discount` double DEFAULT NULL,
 `discountS` double DEFAULT NULL,
 `policy` int(11) DEFAULT NULL,
 `packing` int(11) DEFAULT NULL,
 `production` int(11) DEFAULT NULL,
 `vitrine` int(11) DEFAULT NULL,
 `sch_start` char(14) DEFAULT NULL,
 `sch_finish` char(14) DEFAULT NULL,
 `batch_time` char(14) DEFAULT NULL,
 `comments` varchar(255) DEFAULT NULL,
 `preorder` int(11) DEFAULT NULL,
 PRIMARY KEY (`order_id`),
 KEY `Index_Customer` (`customer_id`),
 KEY `Index_3` (`dispatch_time`)
) ENGINE=InnoDB AUTO_INCREMENT=46739244 DEFAULT CHARSET=greek COMMENT='InnoDB free: 12288 kB'

正如您提到的，由於cardinality兩張表之間的差異。聽起來你明白cardinality表面上是什麼，但要回答你的問題“為什麼？” 讓我先提供更多資訊。
簡而言之，cardinality它是對 a 中給定值的唯一性的度量，Table換句話說，它衡量一個值相對於 that 中的總值的出現次數Table。SQL 引擎會儲存您statistics的cardinalities每個值的相關資訊，Tables因此它可以在以後查詢時決定以最有效的方式提供該數據。
當您編寫查詢時，（和子句predicates中的條件值）用於根據這些值過濾數據。因此，SQL 引擎將其用於您的那些特定值來決定哪種類型的性能最高，例如具有低（低唯一性，因此大量記錄包含該值）的東西 a有意義，而不是（在一個 B 樹），它會出現在一個高（高唯一性，不是很多行包含該值）的值上。WHERE``JOIN``Table``statistics``predicates``Execution Plan``cardinality``full scan``index seek``cardinality
話雖如此，直接回答您的問題“為什麼？” 是因為index seek對大量值的操作（相對而言）通常比 a 慢得多full scan。儘管引擎並不完全完美，但這就是index hints存在的原因，因為您可能比引擎更了解數據，並且有時暗示它通常想要做的事情是糾正此類小錯誤的方法。

引用自：https://dba.stackexchange.com/questions/281399

MariaDB 10.4 查詢優化器錯誤的執行計劃

相關問答

MariaDB 查詢優化器忽略索引

實際行和估計行差異很大

由於連接了 2 個良好估計的結果，如何更正行估計

MariaDB 想在 Aria 上使用奇怪且效率極低的查詢計劃，而不是在 InnoDB 上（5 秒與 1 小時以上的查詢）

SQL Server 2014：自聯接基數估計不一致的任何解釋？

排序時的錯誤估計，沒有排序的好估計