Mysql
從不使用以 DATETIME 作為複合鍵第一部分的主鍵索引
我在將日期時間(甚至日期)作為主鍵的第一部分進行索引時遇到問題。
我使用 MySQL 5.5
這是我的兩張表:
-- This is my standard table with dateDim as a dateTime CREATE TABLE `stats` ( `dateDim` datetime NOT NULL, `accountDim` mediumint(8) unsigned NOT NULL, `execCodeDim` smallint(5) unsigned NOT NULL, `operationTypeDim` tinyint(3) unsigned NOT NULL, `junkDim` tinyint(3) unsigned NOT NULL, `ipCountryDim` smallint(5) unsigned NOT NULL, `count` int(10) unsigned NOT NULL, `amount` bigint(20) NOT NULL, PRIMARY KEY (`dateDim`,`accountDim`,`execCodeDim`,`operationTypeDim`,`junkDim`,`ipCountryDim`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 -- Here is a copy with datDim as an integer CREATE TABLE `stats_todays` ( `dateDim` int(11) unsigned NOT NULL, `accountDim` mediumint(8) unsigned NOT NULL, `execCodeDim` smallint(5) unsigned NOT NULL, `operationTypeDim` tinyint(3) unsigned NOT NULL, `junkDim` tinyint(3) unsigned NOT NULL, `ipCountryDim` smallint(5) unsigned NOT NULL, `count` int(10) unsigned NOT NULL, `amount` bigint(20) NOT NULL, PRIMARY KEY (`dateDim`,`accountDim`,`execCodeDim`,`operationTypeDim`,`junkDim`,`ipCountryDim`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8
我用完全相同的數據填充兩個表(接近 10 000 000)
但:
- stats 表對 dateDim 使用 DATETIME
- stats_todays 使用 un INTEGER 和 TO_DAYS() 作為 dateDim
我的問題是:當索引的第一部分是日期時間時,為什麼 MySQL 不使用主鍵???這很奇怪,因為使用相同的數據但與 INTEGER 和 TO_DAYS(dateDim) 合併後,相同的請求會動搖……
統計表(和日期時間)的範例:
SELECT * FROM `stats` WHERE dateDim = '2014-04-03 00:00:00' AND accountDim = 4 AND execCodeDim = 9 AND operationTypeDim = 1 AND junkDim = 5 AND ipCountryDim = 3 => 1 result (4.5sec) Explain: id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE stats ALL NULL NULL NULL NULL 8832329 Using where
對其他表 stats_todays 的相同請求(使用 INTEGER 和 TO_DAYS() )
EXPLAIN SELECT * FROM `stats_todays` WHERE dateDim = TO_DAYS('2014-04-03 00:00:00') AND accountDim = 4 AND execCodeDim = 9 AND operationTypeDim = 1 AND junkDim = 5 AND ipCountryDim = 3 => Result 1 row (0.0003 sec) Explain: id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE stats_todays const PRIMARY PRIMARY 13 const,const,const,const,const,const 1
如果您閱讀了完整的文章,您就會明白這不是一個低基數問題,因為請求使用與 INTEGER dateDim 欄位完全相同的基數…。
以下是一些高級細節:
SELECT COUNT( DISTINCT dateDim ) FROM stats_todays UNION ALL SELECT COUNT( DISTINCT dateDim ) FROM stats; Result: COUNT(DISTINCT dateDim) 2192 2192
這是索引說明:
SHOW INDEXES FROM `stats` Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment Index_comment stats 0 PRIMARY 1 dateDim A 6921 NULL NULL BTREE stats 0 PRIMARY 2 accountDim A 883232 NULL NULL BTREE stats 0 PRIMARY 3 execCodeDim A 8832329 NULL NULL BTREE stats 0 PRIMARY 4 operationTypeDim A 8832329 NULL NULL BTREE stats 0 PRIMARY 5 junkDim A 8832329 NULL NULL BTREE stats 0 PRIMARY 6 ipCountryDim A 8832329 NULL NULL BTREE SHOW INDEXES FROM `stats_todays` Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment Index_comment stats_todays 0 PRIMARY 1 dateDim A 7518 NULL NULL BTREE stats_todays 0 PRIMARY 2 accountDim A 4022582 NULL NULL BTREE stats_todays 0 PRIMARY 3 execCodeDim A 8045164 NULL NULL BTREE stats_todays 0 PRIMARY 4 operationTypeDim A 8045164 NULL NULL BTREE stats_todays 0 PRIMARY 5 junkDim A 8045164 NULL NULL BTREE stats_todays 0 PRIMARY 6 ipCountryDim A 8045164 NULL NULL BTREE
SELECT dateDim, COUNT(*) FROM stats GROUP BY dateDim WITH ROLLUP
- 告訴有 2192 個不同的日期,並且重新分區是平滑的(按日期大約 3000 - 4000 行)
- 表中有 8 831 990 行
- 另一張桌子也一樣
- 我嘗試使用 COVERING INDEX(將 * 替換為所有 PK 列)=> 沒有任何改變
- 我試過 force|use index => 沒有任何改變
- 與日期欄位相同,而不是日期時間
- 與 INDEX 或 UNIQUE 相同,而不是主鍵
這是 5.5.x 中的一個錯誤。看這裡
這表明您的查詢應該是
SELECT * FROM `stats` WHERE dateDim = CAST('2014-04-03 00:00:00' as datetime) AND accountDim = 4 AND execCodeDim = 9 AND operationTypeDim = 1 AND junkDim = 5 AND ipCountryDim = 3
自 int 版本的表
CREATE TABLE `stats_todays` ( `dateDim` int(11) unsigned NOT NULL, `accountDim` mediumint(8) unsigned NOT NULL, `execCodeDim` smallint(5) unsigned NOT NULL, `operationTypeDim` tinyint(3) unsigned NOT NULL, `junkDim` tinyint(3) unsigned NOT NULL, `ipCountryDim` smallint(5) unsigned NOT NULL, `count` int(10) unsigned NOT NULL, `amount` bigint(20) NOT NULL, PRIMARY KEY (`dateDim`,`accountDim`,`execCodeDim`,`operationTypeDim`,`junkDim`,`ipCountryDim`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8
在查詢方面工作正常,您應該讓 dateDim 包含日期時間字元串的UNIX_TIMESTAMP()。您的查詢看起來更像這樣:
SELECT * FROM `stats` WHERE dateDim = UNIX_TIMESTAMP('2014-04-03 00:00:00') AND accountDim = 4 AND execCodeDim = 9 AND operationTypeDim = 1 AND junkDim = 5 AND ipCountryDim = 3