Mysql

是否可以創建一個 MySQL 分區,然後通過另一列創建一個子分區?

  • July 26, 2022

我閱讀了有關子分區的文件,但我認為它沒有涵蓋我想要實現的目標。 https://dev.mysql.com/doc/refman/5.7/en/partitioning-subpartitions.html

我們有這張表:

CREATE TABLE `posts_countries` (
 `customer_id` int unsigned NOT NULL,
 `partition_date` mediumint unsigned DEFAULT '0',
 `post_id` int unsigned NOT NULL,
 `country` varchar(6) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
 `impressions` int NOT NULL,
 `engagement` int DEFAULT NULL,
 PRIMARY KEY (`customer_id`,`post_id`,`country`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

我們有 MILLON 行,每個查詢的性能都很慢,所以我們想用 MySQL 創建 PARTITION。一些測試提高了性能,因為我們總是按 customer_id 和 post_id 查詢。

我現在想要實現的是通過 customer_id 進行 PARTITION,然後通過 partition_date。例如分區日期,將 202207 儲存為 2022 年 7 月,依此類推。

我們只想按客戶創建 13 個分區(他的客戶很少)。因此,分區 1 將是:

Customer 1 -> VALUES LESS THAN MAX
Customer 1 -> VALUES LESS THAN 202207
Customer 1 -> VALUES LESS THAN 202206
...
Customer 1 -> VALUES LESS THAN 202107
...
...

Customer 2 -> VALUES LESS THAN MAX
Customer 2 -> VALUES LESS THAN 202207
Customer 2 -> VALUES LESS THAN 202206
...
Customer 2 -> VALUES LESS THAN 202107

我嘗試以這種方式進行分區,但不是按 client_id 進行分區

ALTER TABLE posts_countries
   PARTITION BY RANGE (partition_date)
SUBPARTITION BY HASH(customer_id)
SUBPARTITIONS 10
(
PARTITION 0c1 VALUES LESS THAN (202108) ENGINE = InnoDB,
PARTITION 0c2 VALUES LESS THAN (202109) ENGINE = InnoDB,
PARTITION 0c3 VALUES LESS THAN (202110) ENGINE = InnoDB,
PARTITION 0c4 VALUES LESS THAN (202111) ENGINE = InnoDB,
PARTITION 0c5 VALUES LESS THAN (202112) ENGINE = InnoDB,
PARTITION 0c6 VALUES LESS THAN (202201) ENGINE = InnoDB,
PARTITION 0c7 VALUES LESS THAN (202202) ENGINE = InnoDB,
PARTITION 0c8 VALUES LESS THAN (202203) ENGINE = InnoDB,
PARTITION 0c9 VALUES LESS THAN (202204) ENGINE = InnoDB,
PARTITION 0c10 VALUES LESS THAN (202205) ENGINE = InnoDB,
PARTITION 0c11 VALUES LESS THAN (202206) ENGINE = InnoDB,
PARTITION 0c12 VALUES LESS THAN (202207) ENGINE = InnoDB,
PARTITION 0c13 VALUES LESS THAN (MAXVALUE) ENGINE = InnoDB
);

這種分區結構可能嗎?謝謝指教。

沒有什麼能阻止。

CREATE TABLE test (id INT, grp INT, val INT)
    PARTITION BY RANGE( grp )
    SUBPARTITION BY HASH( val )
    SUBPARTITIONS 2 (
        PARTITION p0 VALUES LESS THAN (10),
        PARTITION p1 VALUES LESS THAN MAXVALUE
    );
INSERT INTO test VALUES
(1,1,1),
(2,10,2),
(3,20,3),
(4,50,4),
(5,80,5),
(6,2000,6);
SELECT TABLE_NAME,
       PARTITION_NAME,
       SUBPARTITION_NAME,
       TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE();

db<>在這裡擺弄

SUBPARTITION BY KEY( val )也可用。

分區不是性能的靈丹妙藥。在您檢查(問我)它是否會加快您的查詢之前,不要嘗試添加分區。

您可能需要分區的主要情況:

  • 您將清除舊數據並希望使用DROP PARTITION而不是DELETE.
  • 查詢是“二維”的,兩者都有範圍——如緯度和經度。

向我們展示您的主要查詢。聞起來像

SELECT ...
   FROM ...
   WHERE customer_id = 4
     AND some_date BETWEEN ...

好吧,這個“複合”索引非常適合非分區表:

INDEX(customer_id, some_date)

(這不是真正的二維,因為您將始終(?)=在測試時使用customer_id。)

更多討論: 分區

引用自:https://dba.stackexchange.com/questions/314855