Mysql

從不使用以 DATETIME 作為複合鍵第一部分的主鍵索引

  • September 4, 2017

我在將日期時間(甚至日期)作為主鍵的第一部分進行索引時遇到問題。

我使用 MySQL 5.5

這是我的兩張表:

-- This is my standard table with dateDim as a dateTime

CREATE TABLE `stats` (
`dateDim` datetime NOT NULL,
`accountDim` mediumint(8) unsigned NOT NULL,
`execCodeDim` smallint(5) unsigned NOT NULL,
`operationTypeDim` tinyint(3) unsigned NOT NULL,
`junkDim` tinyint(3) unsigned NOT NULL,
`ipCountryDim` smallint(5) unsigned NOT NULL,
`count` int(10) unsigned NOT NULL,
`amount` bigint(20) NOT NULL,
PRIMARY KEY (`dateDim`,`accountDim`,`execCodeDim`,`operationTypeDim`,`junkDim`,`ipCountryDim`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8


-- Here is a copy with datDim as an integer

CREATE TABLE `stats_todays` (
`dateDim` int(11) unsigned NOT NULL,
`accountDim` mediumint(8) unsigned NOT NULL,
`execCodeDim` smallint(5) unsigned NOT NULL,
`operationTypeDim` tinyint(3) unsigned NOT NULL,
`junkDim` tinyint(3) unsigned NOT NULL,
`ipCountryDim` smallint(5) unsigned NOT NULL,
`count` int(10) unsigned NOT NULL,
`amount` bigint(20) NOT NULL,
PRIMARY KEY (`dateDim`,`accountDim`,`execCodeDim`,`operationTypeDim`,`junkDim`,`ipCountryDim`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

我用完全相同的數據填充兩個表(接近 10 000 000)

但:

  • stats 表對 dateDim 使用 DATETIME
  • stats_todays 使用 un INTEGER 和 TO_DAYS() 作為 dateDim

我的問題是:當索引的第一部分是日期時間時,為什麼 MySQL 不使用主鍵???這很奇怪,因為使用相同的數據但與 INTEGER 和 TO_DAYS(dateDim) 合併後,相同的請求會動搖……

統計表(和日期時間)的範例:

SELECT * 
FROM `stats`  
WHERE 
  dateDim = '2014-04-03 00:00:00' 
  AND accountDim = 4
  AND execCodeDim = 9
  AND operationTypeDim = 1
  AND junkDim = 5
  AND ipCountryDim = 3

=> 1 result (4.5sec)

Explain:

id  select_type     table   type    possible_keys   key     key_len     ref     rows           Extra
1   SIMPLE          stats   ALL           NULL     NULL       NULL      NULL    8832329     Using where

對其他表 stats_todays 的相同請求(使用 INTEGER 和 TO_DAYS() )

EXPLAIN SELECT * 
FROM `stats_todays`  
WHERE 
  dateDim = TO_DAYS('2014-04-03 00:00:00')
  AND accountDim = 4
  AND execCodeDim = 9
  AND operationTypeDim = 1
  AND junkDim = 5
  AND ipCountryDim = 3

=> Result 1 row (0.0003 sec) 

Explain:

id  select_type     table          type     possible_keys   key     key_len     ref                               rows  Extra
1   SIMPLE         stats_todays     const   PRIMARY     PRIMARY     13  const,const,const,const,const,const     1    

如果您閱讀了完整的文章,您就會明白這不是一個低基數問題,因為請求使用與 INTEGER dateDim 欄位完全相同的基數…。

以下是一些高級細節:

SELECT COUNT( DISTINCT dateDim )
FROM stats_todays
UNION ALL
SELECT COUNT( DISTINCT dateDim )
FROM stats;

Result:


COUNT(DISTINCT dateDim)
2192
2192

這是索引說明:

SHOW INDEXES FROM `stats` 

Table   Non_unique  Key_name    Seq_in_index    Column_name     Collation   Cardinality     Sub_part    Packed  Null    Index_type  Comment     Index_comment
stats   0            PRIMARY          1         dateDim           A     6921           NULL                 NULL        BTREE        
stats   0            PRIMARY          2         accountDim        A     883232         NULL                 NULL        BTREE        
stats   0            PRIMARY          3         execCodeDim       A     8832329     NULL                NULL        BTREE        
stats   0            PRIMARY          4         operationTypeDim  A     8832329     NULL                NULL        BTREE        
stats   0            PRIMARY          5         junkDim           A     8832329     NULL                NULL        BTREE        
stats   0            PRIMARY          6         ipCountryDim      A     8832329     NULL                NULL        BTREE       

SHOW INDEXES FROM `stats_todays` 

Table   Non_unique  Key_name    Seq_in_index    Column_name     Collation   Cardinality     Sub_part    Packed  Null    Index_type  Comment     Index_comment
stats_todays    0   PRIMARY     1              dateDim              A        7518   NULL                   NULL         BTREE        
stats_todays    0   PRIMARY     2              accountDim           A        4022582    NULL                   NULL         BTREE        
stats_todays    0   PRIMARY     3              execCodeDim          A        8045164    NULL                   NULL         BTREE        
stats_todays    0   PRIMARY     4              operationTypeDim     A        8045164    NULL                   NULL         BTREE        
stats_todays    0   PRIMARY     5              junkDim              A        8045164    NULL                   NULL         BTREE        
stats_todays    0   PRIMARY     6              ipCountryDim         A        8045164    NULL                   NULL         BTREE        

SELECT dateDim, COUNT(*) FROM stats GROUP BY dateDim WITH ROLLUP

  • 告訴有 2192 個不同的日期,並且重新分區是平滑的(按日期大約 3000 - 4000 行)
  • 表中有 8 831 990 行
  • 另一張桌子也一樣
  • 我嘗試使用 COVERING INDEX(將 * 替換為所有 PK 列)=> 沒有任何改變
  • 我試過 force|use index => 沒有任何改變
  • 與日期欄位相同,而不是日期時間
  • 與 INDEX 或 UNIQUE 相同,而不是主鍵

這是 5.5.x 中的一個錯誤。看這裡

這表明您的查詢應該是

SELECT * 
FROM `stats`  
WHERE 
  dateDim = CAST('2014-04-03 00:00:00' as datetime)
  AND accountDim = 4
  AND execCodeDim = 9
  AND operationTypeDim = 1
  AND junkDim = 5
  AND ipCountryDim = 3

自 int 版本的表

CREATE TABLE `stats_todays` ( 
`dateDim` int(11) unsigned NOT NULL, 
`accountDim` mediumint(8) unsigned NOT NULL, 
`execCodeDim` smallint(5) unsigned NOT NULL, 
`operationTypeDim` tinyint(3) unsigned NOT NULL, 
`junkDim` tinyint(3) unsigned NOT NULL, 
`ipCountryDim` smallint(5) unsigned NOT NULL, 
`count` int(10) unsigned NOT NULL, 
`amount` bigint(20) NOT NULL, 
PRIMARY KEY (`dateDim`,`accountDim`,`execCodeDim`,`operationTypeDim`,`junkDim`,`ipCountryDim`) 
) ENGINE=InnoDB DEFAULT CHARSET=utf8 

在查詢方面工作正常,您應該讓 dateDim 包含日期時間字元串的UNIX_TIMESTAMP()。您的查詢看起來更像這樣:

SELECT *        
FROM `stats`         
WHERE        
  dateDim = UNIX_TIMESTAMP('2014-04-03 00:00:00')
  AND accountDim = 4       
  AND execCodeDim = 9       
  AND operationTypeDim = 1       
  AND junkDim = 5       
  AND ipCountryDim = 3       

引用自:https://dba.stackexchange.com/questions/10340