PostgreSQL

PostgreSQL RDS: performance slower than expected when a large table/index is not in memory

  • October 25, 2021

I have a fairly large table, 43 GB with a 14 GB index, containing time-series cost data. I query by date and sum the amounts. When the data is not cached (by the OS or by Postgres), queries for users who have millions of rows in a given time period can take up to 50 seconds, even though the other filters usually cut the result down to a few thousand rows. As far as I can tell, the index is well matched to the query pattern. I have run EXPLAIN (ANALYZE, BUFFERS) and can see clearly that the slowness comes from reading from disk.

My workload is a bit unusual in that I do large batch writes and large batch deletes, so I believe VACUUM has a lot of work to do. I haven't tuned it at all, though in fact I don't think it would help given the index I'm using. I track "active imports" and delete old, inactive imports. The active import IDs are part of both the query and the index, so I shouldn't be scanning dead tuples left over from earlier imports.
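
(As a sanity check on the dead-tuple theory, the standard statistics view can show whether autovacuum is keeping up; this is a generic diagnostic query, not something from the original post:)

SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'service_costs';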

I also tried lowering random_page_cost to 1.1, which makes the planner choose an index scan instead of a bitmap heap scan, but performance ends up roughly the same.
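
(For reference, that experiment can be run per session without touching the server configuration; 1.1 is the value mentioned above:)

SET random_page_cost = 1.1;  -- session-only; tells the planner random reads are nearly as cheap as sequential ones
-- re-run the EXPLAIN (ANALYZE, BUFFERS) query shown below and compare plans
RESET random_page_cost;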

I'm running Postgres 13.4 on an RDS db.r6g.2xlarge (8 vCPUs, 64 GB of RAM) with provisioned IOPS (11,000; I typically peak at about 9,000 reads + writes combined).

I didn't expect uncached queries to be this slow. Do I have the wrong expectations here? I have already raised shared_buffers to 40% of RAM, and given the size of the table and index I realize I should probably move up to a 128 GB instance, which is my next step.
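
(One way to see how much of the table and index actually sit in shared_buffers is the pg_buffercache extension, which is available on RDS; this is a standard diagnostic query adapted from the PostgreSQL docs, not part of the original post:)

CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- count 8 kB buffers held per relation in shared_buffers
SELECT c.relname,
       count(*) AS buffers,
       pg_size_pretty(count(*) * 8192) AS cached
FROM pg_buffercache b
JOIN pg_class c
  ON b.relfilenode = pg_relation_filenode(c.oid)
 AND b.reldatabase = (SELECT oid FROM pg_database
                      WHERE datname = current_database())
WHERE c.relname IN ('service_costs',
                    'indx_srvc_csts_on_cst_type__dt__srvc__cst_ctgry__cst_sb_ctgry')
GROUP BY c.relname;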

                                    Table "public.service_costs"
        Column          |              Type              | Collation | Nullable |         Default          | Storage  | Stats target | Description
-------------------------+--------------------------------+-----------+----------+--------------------------+----------+--------------+-------------
id                      | uuid                           |           | not null | public.gen_random_uuid() | plain    |              |
date                    | timestamp without time zone    |           | not null |                          | plain    |              |
cost_type               | character varying              |           | not null |                          | extended |              |
service                 | character varying              |           | not null |                          | extended |              |
amount                  | numeric                        |           | not null |                          | main     |              |
cost_category           | character varying              |           |          |                          | extended |              |
cost_sub_category       | character varying              |           |          |                          | extended |              |
service_costs_import_id | bigint                         |           | not null |                          | plain    |              |

Indexes:
   "service_costs_pkey" PRIMARY KEY, btree (id)
   "indx_srvc_csts_on_cst_type__dt__srvc__cst_ctgry__cst_sb_ctgry" btree (service_costs_import_id, cost_type, date, service, cost_category, cost_sub_category)
Access method: heap

   schema_name     |                             relname                             |    size    | table_size
--------------------+-----------------------------------------------------------------+------------+-------------
public             | service_costs                                                   | 43 GB      | 46511259648
public             | indx_srvc_csts_on_cst_type__dt__srvc__cst_ctgry__cst_sb_ctgry   | 14 GB      | 15080833024

EXPLAIN (ANALYZE, BUFFERS) SELECT SUM(service_costs.amount)
FROM service_costs WHERE service_costs.cost_type IN (...)
AND service_costs.service_costs_import_id IN (2066, 2067, 1267, 1269, 1268, 1270, 2068, 1273, 4996, 5047)
AND service_costs.service = '....'
AND "service_costs"."date" BETWEEN '2021-10-01' AND '2021-10-31 23:59:59.999999';

Aggregate  (cost=1390974.93..1390974.94 rows=1 width=32) (actual time=17067.830..17067.831 rows=1 loops=1)
  Buffers: shared hit=6854 read=80448 dirtied=754
  I/O Timings: read=16236.006
  ->  Bitmap Heap Scan on service_costs  (cost=351286.12..1390173.71 rows=320487 width=5) (actual time=4827.074..16996.060 rows=323382 loops=1)
        Recheck Cond: ((service_costs_import_id = ANY ('{2066,2067,1267,1269,1268,1270,2068,1273,4996,5047}'::bigint[])) AND ((cost_type)::text = ANY ('{...}'::text[])) AND (date >= '2021-10-01 00:00:00'::timestamp without time zone) AND (date <= '2021-10-31 23:59:59.999999'::timestamp without time zone) AND ((service)::text = '...'::text))
        Heap Blocks: exact=70327
        Buffers: shared hit=6854 read=80448 dirtied=754
        I/O Timings: read=16236.006
        ->  Bitmap Index Scan on indx_srvc_csts_on_cst_type__dt__srvc__cst_ctgry__cst_sb_ctgry  (cost=0.00..351206.00 rows=320487 width=0) (actual time=4815.759..4815.759 rows=323382 loops=1)
              Index Cond: ((service_costs_import_id = ANY ('{2066,2067,1267,1269,1268,1270,2068,1273,4996,5047}'::bigint[])) AND ((cost_type)::text = ANY ('{...}'::text[])) AND (date >= '2021-10-01 00:00:00'::timestamp without time zone) AND (date <= '2021-10-31 23:59:59.999999'::timestamp without time zone) AND ((service)::text = '...'::text))
              Buffers: shared hit=159 read=16816
              I/O Timings: read=4575.310
Planning Time: 0.159 ms
Execution Time: 17067.865 ms
(14 rows)

The same query run again immediately afterward, with the data now cached:

Aggregate  (cost=1390974.93..1390974.94 rows=1 width=32) (actual time=403.002..403.003 rows=1 loops=1)
  Buffers: shared hit=87302
  ->  Bitmap Heap Scan on service_costs  (cost=351286.12..1390173.71 rows=320487 width=5) (actual time=206.128..338.491 rows=323382 loops=1)
        Recheck Cond: ((service_costs_import_id = ANY ('{2066,2067,1267,1269,1268,1270,2068,1273,4996,5047}'::bigint[])) AND ((cost_type)::text = ANY ('{....}'::text[])) AND (date >= '2021-10-01 00:00:00'::timestamp without time zone) AND (date <= '2021-10-31 23:59:59.999999'::timestamp without time zone) AND ((service)::text = '...'::text))
        Heap Blocks: exact=70327
        Buffers: shared hit=87302
        ->  Bitmap Index Scan on indx_srvc_csts_on_cst_type__dt__srvc__cst_ctgry__cst_sb_ctgry  (cost=0.00..351206.00 rows=320487 width=0) (actual time=195.167..195.167 rows=323382 loops=1)
              Index Cond: ((service_costs_import_id = ANY ('{...}'::bigint[])) AND ((cost_type)::text = ANY ('{...}'::text[])) AND (date >= '2021-10-01 00:00:00'::timestamp without time zone) AND (date <= '2021-10-31 23:59:59.999999'::timestamp without time zone) AND ((service)::text = '....'::text))
              Buffers: shared hit=16975
Planning Time: 0.168 ms
Execution Time: 403.042 ms

(11 rows)

Index Cond: ((service_costs_import_id = ANY ('{2066,2067,1267,1269,1268,1270,2068,1273,4996,5047}'::bigint[])) AND ((cost_type)::text = ANY ('{...}'::text[])) AND (date >= '2021-10-01 00:00:00'::timestamp without time zone) AND (date <= '2021-10-31 23:59:59.999999'::timestamp without time zone) AND ((service)::text = '...'::text))

It looks like your index is on (service_costs_import_id, cost_type, date, service), though it may have more columns that the query doesn't use. If the columns are in that order, the problem with the index is that "service" cannot be used efficiently, because it follows the "date" column, which is compared with a range rather than an equality. So "service" can only filter out rows; it cannot be used to jump to a specific spot in the index. If you reverse the order of the last two columns in the index, it will be able to use all of the columns efficiently.
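
(A sketch of that reordering; the index name here is made up for illustration, and CONCURRENTLY avoids blocking the import workload while it builds:)

-- equality columns (import id, cost_type, service) first, the range column
-- (date) last, so all four predicates position the scan instead of
-- "service" merely filtering rows after the fact
CREATE INDEX CONCURRENTLY srvc_csts_import_type_srvc_date_idx
    ON service_costs (service_costs_import_id, cost_type, service, date);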

Better still, if you reverse the order and then add "amount" to the end, you could get index-only scans. Making those effective, though, may require more aggressive vacuuming.
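
(The answer suggests appending amount to the key; on Postgres 11+ an INCLUDE clause is an equivalent way to carry it in the leaf pages for index-only scans. The index name and the per-table autovacuum value below are illustrative assumptions, not from the post:)

-- covering index: date last among the key columns, amount carried as payload
CREATE INDEX CONCURRENTLY srvc_csts_covering_idx
    ON service_costs (service_costs_import_id, cost_type, service, date)
    INCLUDE (amount);

-- index-only scans need an up-to-date visibility map; with heavy batch
-- deletes that may mean more aggressive autovacuum on this table, e.g.:
ALTER TABLE service_costs SET (autovacuum_vacuum_scale_factor = 0.01);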

With 11,000 provisioned IOPS, reading 80,000 blocks shouldn't take 16,000 ms (80,448 reads at 11,000 IOPS comes to roughly 7.3 seconds). But that ignores latency, which I haven't seen described anywhere except in the vaguest terms in the AWS documentation. If you have to wait for each block to come back before sending the next request, you will never come close to the maximum available IOPS. You could raise effective_io_concurrency to see whether keeping several requests in flight at once improves things. (It won't help the bitmap index scan part, only the bitmap heap scan.) Of course, this analysis assumes a single query running at a time; if several queries have to share the throughput, it gets divided among them.
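
(A quick way to try that; effective_io_concurrency is user-settable, so no restart or parameter-group change is needed for a test:)

SET effective_io_concurrency = 200;  -- default is 1; SSD/EBS storage tolerates many in-flight requests
-- re-run the EXPLAIN (ANALYZE, BUFFERS) query above; only the bitmap heap
-- scan can prefetch, the bitmap index scan stays synchronous
RESET effective_io_concurrency;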

Source: https://dba.stackexchange.com/questions/301637