Partitioning

timescaledb:批量插入耗盡所有記憶體

  • August 23, 2021

概括

我正在嘗試將數據批量插入 timescaledb 超表中。無論我嘗試什麼,記憶體使用量都會逐漸增長,直到伺服器程序由於記憶體不足而被終止。我在伺服器上的數據集小至 160 萬行時觀察到這一點,其中 128 GB 的 RAM 可用於 postgres/timescaledb,所以我一定做錯了什麼。對不是超表的表執行完全相同的插入操作就可以了,因此問題必須與 timescaledb 有關。

我想要做什麼

我希望插入的表定義如下:

CREATE TABLE test (
   gid BIGINT NOT NULL,
   location_ GEOGRAPHY(POINT),
   -- about 15 other VARCHAR and BIGINT columns omitted for brevity
   far_end_gid BIGINT NOT NULL,
   day_partition_key DATE NOT NULL,
   PRIMARY KEY (gid, far_end_gid, day_partition_key)
);
SELECT create_hypertable(
   'test', 'day_partition_key', chunk_time_interval => INTERVAL '1 day');

要插入的數據在數據庫中的另一個表中(通過\COPY操作填充的臨時表);除了某些欄位需要一些解析(將字元串轉換為日期等)之外,此另一個表具有與目標超表相同的欄位。失敗的INSERT查詢是

INSERT INTO topology_test
   SELECT
       tobigintornull(gid) AS gid_as_int,
       -- about 15 other VARCHAR and BIGINT columns omitted
       CAST(CAST(REPLACE(far_end_gid, ',', '.') AS DOUBLE PRECISION) AS BIGINT),
       CAST(tobigintornull(far_end_cell_id) AS BIGINT),
       TO_DATE(day_partition_key, 'YYYYMMDD')
   FROM test_temp_table
   WHERE
       tobigintornull(gid) is not null
       and tobigintornull(REPLACE(far_end_gid, ',', '.')) is not null
       and day_partition_key is not null
   ON CONFLICT DO NOTHING;

ON CONFLICT部分旨在靜默刪除重複的主鍵,並且該函式tobigintornull執行其名稱所暗示的操作:如果可能,它將輸入轉換為 bigint,如果不是,則返回 null(這有助於刪除無法解析的行)。它的定義是

CREATE OR REPLACE FUNCTION tobigintornull(text) RETURNS BIGINT AS $$
DECLARE x BIGINT;
BEGIN
   x = $1::BIGINT;
   RETURN x;
EXCEPTION WHEN others THEN
   RETURN NULL;
END;
$$
STRICT
LANGUAGE plpgsql IMMUTABLE;

輸入的定義DOUBLE PRECISION類似。WHERE請注意,該子句僅刪除了一小部分行(絕對小於 1%) 。

完整的數據集是 5.93 億行,並且沒有以任何方式排序。懷疑輸入缺乏排序是問題的一部分,我創建了一個包含 160 萬行的數據子集,其中所有行的值都相同day_partition_key(從 timescaledb 的角度來看,該子集應該是完美排序的)。

問題

問題表現為 postgres 的記憶體使用量逐漸增加,直到數據庫可用的全部 128 GB 被使用(大約需要 5 分鐘)。在 100% 記憶體使用一分鐘左右後,插入查詢崩潰。日誌顯示以下內容:

timescaledb_1  | 2021-08-23 11:56:50.727 UTC [1] LOG:  server process (PID 1231) was terminated by signal 9: Killed
timescaledb_1  | 2021-08-23 11:56:50.727 UTC [1] DETAIL:  Failed process was running: INSERT INTO test
timescaledb_1  |            SELECT

(...)

timescaledb_1  | 2021-08-23 11:56:50.727 UTC [1] LOG:  terminating any other active server processes
timescaledb_1  | 2021-08-23 11:56:50.741 UTC [1215] WARNING:  terminating connection because of crash of another server process
timescaledb_1  | 2021-08-23 11:56:50.741 UTC [1215] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
timescaledb_1  | 2021-08-23 11:56:50.741 UTC [1215] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
timescaledb_1  | 2021-08-23 11:56:50.744 UTC [1221] WARNING:  terminating connection because of crash of another server process
timescaledb_1  | 2021-08-23 11:56:50.744 UTC [1221] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
timescaledb_1  | 2021-08-23 11:56:50.744 UTC [1221] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
timescaledb_1  | 2021-08-23 11:56:50.771 UTC [1] LOG:  all server processes terminated; reinitializing
timescaledb_1  | 2021-08-23 11:56:51.371 UTC [1259] LOG:  database system was interrupted; last known up at 2021-08-23 11:50:40 UTC
timescaledb_1  | 2021-08-23 11:56:51.711 UTC [1259] LOG:  database system was not properly shut down; automatic recovery in progress
timescaledb_1  | 2021-08-23 11:56:51.720 UTC [1259] LOG:  redo starts at 25A/16567EF8
timescaledb_1  | 2021-08-23 11:56:51.788 UTC [1259] LOG:  invalid record length at 25A/165C1A30: wanted 24, got 0
timescaledb_1  | 2021-08-23 11:56:51.788 UTC [1259] LOG:  redo done at 25A/165C19D0
timescaledb_1  | 2021-08-23 11:56:51.893 UTC [1] LOG:  database system is ready to accept connections
timescaledb_1  | 2021-08-23 11:56:52.039 UTC [1265] LOG:  TimescaleDB background worker launcher connected to shared catalogs

顯然,數據庫恢復後目標表中不存在任何行。

我試圖解決的問題

  • 我已將數據集的大小從 5.93 億行減少到 160 萬行,確保子集對於用於分塊的日期列只有一個值 ( day_partition_key)。結果是完全一樣的。
  • https://github.com/timescale/timescaledb/issues/643的討論之後,我將 timescaledb 配置更改為SET timescaledb.max_open_chunks_per_insert=1;. 問題仍然完全相同。
  • 我嘗試創建另一個目標表而不將其設為超表。然後插入 1.6 M 行子集就可以了。我希望整套都能奏效,但我還沒有花時間去做。

版本、硬體和配置

docker 鏡像 timescale/timescaledb-postgis:latest-pg13 (bf76e5594c98) 用於執行 timescaledb。它包含:

  • x86_64-pc-linux-musl 上的 PostgreSQL 13.3,由 gcc (Alpine 10.2.1_pre1) 10.2.1 20201203 編譯,64 位
  • 時標數據庫 2.3.0
  • POSTGIS=“2.5.5”$$ EXTENSION $$PGSQL=“130” GEOS=“3.8.1-CAPI-1.13.3” PROJ=“Rel. 7.1.1,2020 年 9 月 1 日” GDAL=“GDAL 3.1.4,2020/10/20 發布” LIBXML=" 2.9.10" LIBJSON=“0.15” LIBPROTOBUF=“1.3.3” 光柵

docker 容器限制為 16 個核心和 128 GB 記憶體。對於上面提到的測試,我使用了https://pgtune.leopard.in.ua/#/建議的配置參數用於帶有我的參數的數據倉庫,除了我將提供給 pgtune 的可用記憶體降低到 64 GB,希望這會導致數據庫使用更少記憶體的配置(我也嘗試了推薦用於 128 GB 記憶體的設置,結果是相同的)。設置如下:

max_connections = 30
shared_buffers = 16GB
effective_cache_size = 48GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 500
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 34952kB
min_wal_size = 4GB
max_wal_size = 16GB
max_worker_processes = 16
max_parallel_workers_per_gather = 8
max_parallel_workers = 16
max_parallel_maintenance_workers = 4

偉大的工作,很好的解釋!

我認為您需要分析您的 Postgres 以更好地了解正在發生的事情。

只是想法:

  1. 刪除 plpgsql 函式並驗證您的字元串或使用coalesce來管理空值。

我還考慮使用公用表表達式在一個地方解決您的問題,並避免多次呼叫該函式。

  1. 如果只有規範化的數據重要,而去規範化的結果不重要,可以考慮在插入前增加一個預處理步驟來清理,避免使用 where 子句等。

引用自:https://dba.stackexchange.com/questions/298427