PostgreSQL index caching

August 25, 2021

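I've had difficulty finding "lay" explanations of how indexes are cached in PostgreSQL, so I'd like a reality check on any or all of these assumptions:

1. PostgreSQL indexes, like rows, live on disk but may be cached.
2. An index may be entirely in the cache or not in it at all.
3. Whether it is cached depends on how often it is used (as determined by the query planner).
4. For this reason, most "sensible" indexes will be in the cache all the time.
5. Indexes live in the same cache (the buffer cache?) as rows, so cache space used by an index is not available to rows.

My motivation for understanding this comes from another question I asked, where it was suggested that *partial indexes* can be used on tables where the majority of the data will never be accessed.

Before undertaking this, I'd like to be clear that employing a partial index will yield two advantages:

1. We reduce the size of the index in the cache, freeing up more space in the cache for the rows themselves.
2. We reduce the size of the B-Tree, resulting in faster query responses.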
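
To make the idea concrete, here is a minimal sketch of the kind of partial index meant here. The table `events_demo`, its columns, and the `archived` flag are hypothetical and chosen only for illustration. Rows excluded by the index's `WHERE` clause get no index entries, so the partial index has fewer pages that could ever compete for cache space.

```sql
-- Hypothetical table: most rows are archived and never queried.
CREATE TABLE events_demo (
     id       serial
   , payload  text
   , archived boolean NOT NULL DEFAULT false
);

INSERT INTO events_demo (payload, archived)
SELECT 'event ' || i::text, i % 100 <> 0   -- roughly 99% of rows are archived
FROM generate_series(1, 100000) a(i);

-- A full index and a partial index on the same column, for comparison.
CREATE INDEX events_demo_full_idx    ON events_demo (id);
CREATE INDEX events_demo_partial_idx ON events_demo (id) WHERE NOT archived;

-- Compare on-disk sizes; the partial index should be much smaller.
SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS size
FROM pg_class
WHERE relname IN ('events_demo_full_idx', 'events_demo_partial_idx');
```

Note that the planner can only consider the partial index for queries whose conditions imply its predicate, for example ones that include `WHERE NOT archived`.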

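Playing a bit with pg_buffercache, I could get answers to some of your questions.

1. This is quite obvious, but the results for **(5)** also show that the answer is **YES**.
2. I have yet to set up a good example for this; for now it is more yes than no :) (See my edit below: the answer is **NO**.)
3. Since the planner decides whether or not to use an index, we can say **YES**, it decides what gets cached (but this is more complicated).
4. The exact details of caching could be derived from the source code. I couldn't find much on this topic, apart from one related question (see its author's answer, too). However, I'm pretty sure that this again is far more complicated than a simple yes or no. (Again, my edit gives some idea: since the cache size is limited, those "sensible" indexes compete for the available space. If there are too many of them, they will kick each other out of the cache, so the answer is rather **NO**.)
5. As a simple query against pg_buffercache shows, the answer is a definitive **YES**. It is worth noting that temporary table data is not cached here.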
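
Because every cached page, whether it belongs to a table or an index, occupies one buffer in the same shared pool, the competition for space can be made visible by relating buffer counts to `shared_buffers`. The following is only a sketch, assuming the pg_buffercache extension is installed in the current database; the column aliases are my own.

```sql
-- Per-relation share of the buffer cache, tables ('r') and indexes ('i') alike.
SELECT c.relname
     , c.relkind
     , count(*) AS buffers
     , pg_size_pretty(count(*) * current_setting('block_size')::bigint) AS cached
     , round(100.0 * count(*) /
             (SELECT setting::bigint          -- shared_buffers, counted in blocks
              FROM pg_settings
              WHERE name = 'shared_buffers'), 2) AS pct_of_shared_buffers
FROM pg_buffercache b
   INNER JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
   INNER JOIN pg_database d ON (b.reldatabase = d.oid AND d.datname = current_database())
GROUP BY c.relname, c.relkind
ORDER BY buffers DESC
LIMIT 10;

-- How many buffers are still unused (rows without a relfilenode)?
SELECT sum(CASE WHEN relfilenode IS NULL THEN 1 ELSE 0 END) AS unused_buffers
     , count(*) AS total_buffers
FROM pg_buffercache;
```

Once `unused_buffers` reaches zero, loading any new page means evicting another one, regardless of whether it holds table rows or index entries.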

EDIT

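I've found Jeremiah Peschka's excellent article about table and index storage. With the information from there, I could answer **(2)** as well. I set up a small test, so you can check these yourself.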

-- we will need two extensions
CREATE EXTENSION pg_buffercache;
CREATE EXTENSION pageinspect;


-- a very simple test table
CREATE TABLE index_cache_test (
     id serial
   , blah text
);


-- I am a bit megalomaniac here, but I will use this for other purposes as well
INSERT INTO index_cache_test
SELECT i, i::text || 'a'
FROM generate_series(1, 1000000) a(i);


-- let's create the index to be cached
CREATE INDEX idx_cache_test ON index_cache_test (id);


-- now we can have a look at what is cached
SELECT c.relname,count(*) AS buffers
FROM 
   pg_class c 
   INNER JOIN pg_buffercache b ON b.relfilenode = c.relfilenode 
   INNER JOIN pg_database d ON (b.reldatabase = d.oid AND d.datname = current_database())
GROUP BY c.relname
ORDER BY 2 DESC LIMIT 10;

            relname              | buffers
----------------------------------+---------
index_cache_test                 |    2747
pg_statistic_relid_att_inh_index |       4
pg_operator_oprname_l_r_n_index  |       4
... (others are all pg_something, which are not interesting now)

-- this shows that the whole table is cached and our index is not in use yet

-- now we can check which row is where in our index
-- in the ctid column, the first number shows the page, so 
-- all rows starting with the same number are stored in the same page
SELECT * FROM bt_page_items('idx_cache_test', 1);

itemoffset |  ctid   | itemlen | nulls | vars |          data
------------+---------+---------+-------+------+-------------------------
         1 | (1,164) |      16 | f     | f    | 6f 01 00 00 00 00 00 00
         2 | (0,1)   |      16 | f     | f    | 01 00 00 00 00 00 00 00
         3 | (0,2)   |      16 | f     | f    | 02 00 00 00 00 00 00 00
         4 | (0,3)   |      16 | f     | f    | 03 00 00 00 00 00 00 00
         5 | (0,4)   |      16 | f     | f    | 04 00 00 00 00 00 00 00
         6 | (0,5)   |      16 | f     | f    | 05 00 00 00 00 00 00 00
...
        64 | (0,63)  |      16 | f     | f    | 3f 00 00 00 00 00 00 00
        65 | (0,64)  |      16 | f     | f    | 40 00 00 00 00 00 00 00

-- with the information obtained, we can write a query which is supposed to
-- touch only a single page of the index
EXPLAIN (ANALYZE, BUFFERS) 
   SELECT id 
   FROM index_cache_test 
   WHERE id BETWEEN 10 AND 20 ORDER BY id
;

Index Scan using idx_cache_test on index_cache_test  (cost=0.00..8.54 rows=9 width=4) (actual time=0.031..0.042 rows=11 loops=1)
  Index Cond: ((id >= 10) AND (id <= 20))
  Buffers: shared hit=4
Total runtime: 0.094 ms
(4 rows)

-- let's have a look at the cache again (the query remains the same as above)
            relname              | buffers
----------------------------------+---------
index_cache_test                 |    2747
idx_cache_test                   |       4
...

-- and compare it to a bigger index scan:
EXPLAIN (ANALYZE, BUFFERS) 
SELECT id 
   FROM index_cache_test 
   WHERE id <= 20000 ORDER BY id
;


Index Scan using idx_cache_test on index_cache_test  (cost=0.00..666.43 rows=19490 width=4) (actual time=0.072..19.921 rows=20000 loops=1)
  Index Cond: (id <= 20000)
  Buffers: shared hit=4 read=162
Total runtime: 24.967 ms
(4 rows)

-- this already shows that something was in the cache and further pages were read from disk
-- but to be sure, a final glance at cache contents:

            relname              | buffers
----------------------------------+---------
index_cache_test                 |    2691
idx_cache_test                   |      58

-- note that some of the table pages have disappeared from the cache
-- but, more importantly, a bigger part of our index is now cached

In conclusion, this shows that indexes and tables are cached page by page, so the answer to **(2)** is **NO**.
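
To see the partial caching directly, the number of cached pages can be put next to the total number of pages of each relation. This is a sketch under the assumption that the test objects above (`index_cache_test` and `idx_cache_test`) exist; it counts only the main fork so that the two numbers are comparable.

```sql
-- Cached pages versus total pages for the test table and its index.
SELECT c.relname
     , count(b.bufferid) AS cached_pages
     , pg_relation_size(c.oid) / current_setting('block_size')::bigint AS total_pages
FROM pg_class c
   LEFT JOIN pg_buffercache b
          ON b.relfilenode = pg_relation_filenode(c.oid)
         AND b.relforknumber = 0               -- main fork only
         AND b.reldatabase = (SELECT oid FROM pg_database
                              WHERE datname = current_database())
WHERE c.relname IN ('index_cache_test', 'idx_cache_test')
GROUP BY c.relname, c.oid
ORDER BY c.relname;
```

If `cached_pages` for the index is greater than zero but smaller than `total_pages`, only part of the index is in the cache at that moment.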

And a final one to illustrate that temporary tables are not cached here:

CREATE TEMPORARY TABLE tmp_cache_test AS 
SELECT * FROM index_cache_test ORDER BY id FETCH FIRST 20000 ROWS ONLY;

EXPLAIN (ANALYZE, BUFFERS) SELECT id FROM tmp_cache_test ORDER BY id;

-- checking the buffer cache now shows no sign of the temp table
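
Temporary tables are buffered in session-local memory, sized by the `temp_buffers` setting, rather than in shared buffers, which is why pg_buffercache does not list them. A small sketch of how this can be observed, reusing `tmp_cache_test` from above:

```sql
-- Size of the per-session buffer area used for temporary tables.
SHOW temp_buffers;

-- For a temporary table, the Buffers line should report "local hit" / "local read"
-- counters instead of the "shared hit" / "shared read" seen for regular tables.
EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM tmp_cache_test;
```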

Source: https://dba.stackexchange.com/questions/25513
