PostgreSQL

Slow query / index creation (PostgreSQL 9.2)

  • May 3, 2016

I have the following query:

explain analyze
SELECT split_part(full_path, '/', 4)::INT AS account_id,
      split_part(full_path, '/', 6)::INT AS note_id,
      split_part(full_path, '/', 9)::TEXT AS variation,
      st_size,
      segment_index,
      reverse(split_part(reverse(full_path), '/', 1)) as file_name,
      i.st_ino,
      full_path,
      (i.st_size / 1000000::FLOAT)::NUMERIC(5,2) || 'MB' AS size_mb
FROM gorfs.inodes i
JOIN gorfs.inode_segments s
 ON i.st_ino = s.st_ino_target
WHERE
     i.checksum_md5 IS NOT NULL
 AND s.full_path ~ '^/userfiles/account/[0-9]+/[a-z]+/[0-9]+'
 AND i.st_size > 0
 AND split_part(s.full_path, '/', 4)::INT IN (

SELECT account.id
       FROM public.ja_clients AS account
       WHERE
       NOT (
               ((account.last_sub_pay > EXTRACT('epoch' FROM (transaction_timestamp() - CAST('4 Months' AS INTERVAL)))) AND (account.price_model > 0)) OR
               (account.regdate > EXTRACT('epoch' FROM (transaction_timestamp() - CAST('3 Month' AS INTERVAL)))) OR
               (((account.price_model = 0) AND (account.jobcredits > 0)) AND (account.last_login > EXTRACT('epoch' FROM (transaction_timestamp() - CAST('4 Month' AS INTERVAL)))))
       ) LIMIT 100
);

The query takes a long time to run, and I have not been able to fix it.

These are the indexes I have already created on the inode_segments table:

Indexes:
   "ix_account_id_from_full_path" "btree" (("split_part"("full_path"::"text", '/'::"text", 4)::integer)) WHERE "full_path"::"text" ~ '^/userfiles/account/[0-9]+/[a-z]+/[0-9]+'::"text"
   "ix_inode_segments_ja_files_lookup" "btree" ((
CASE
   WHEN "full_path"::"text" ~ '/[^/]*\.[^/]*$'::"text" THEN "upper"("regexp_replace"("full_path"::"text", '.*\.'::"text", ''::"text", 'g'::"text"))
   ELSE NULL::"text"
END)) WHERE "gorfs"."is_kaminski_note_path"("full_path"::"text")
   "ix_inode_segments_notes_clientids" "btree" (("split_part"("full_path"::"text", '/'::"text", 4)::integer)) WHERE "gorfs"."is_kaminski_note_path"("full_path"::"text")
   "ix_inode_segments_notes_clientids2" "btree" ("full_path")
   "ix_inode_segments_notes_fileids" "btree" (("split_part"("full_path"::"text", '/'::"text", 8)::integer)) WHERE "gorfs"."is_kaminski_note_path"("full_path"::"text")
   "ix_inode_segments_notes_noteids" "btree" ((NULLIF("split_part"("full_path"::"text", '/'::"text", 6), 'unassigned'::"text")::integer)) WHERE "gorfs"."is_kaminski_note_path"("full_path"::"text")

These are the indexes I have already created on the inodes table:

Indexes:
   "ix_inodes_checksum_st_size" "btree" ("checksum_md5", "st_size") WHERE "checksum_md5" IS NOT NULL

Question:

What else can I do to improve the performance of this query?

Update 1:

EXPLAIN ANALYZE: http://explain.depesz.com/s/UBr

The index and function were created as described in the answer below.

Update 2:

EXPLAIN ANALYZE: http://explain.depesz.com/s/LHS

This run uses the query provided in the answer below.

Maybe this will help.

If you frequently rely on the account_id embedded in full_path, you would benefit from a function and a function-based index:

CREATE OR REPLACE FUNCTION gorfs.f_get_account_from_full_path(p_full_path text) RETURNS int AS $body$
SELECT (regexp_matches($1, '^/userfiles/account/([0-9]+)/[a-z]+/[0-9]+'))[1]::int
$body$ LANGUAGE SQL IMMUTABLE SECURITY DEFINER RETURNS NULL ON NULL INPUT;

CREATE INDEX ON gorfs.inode_segments (gorfs.f_get_account_from_full_path(full_path));
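A quick sanity check of the function and index above, as a sketch only. The two sample paths are invented for illustration; a path that matches the pattern should yield the account number, and one that does not should yield NULL. Running ANALYZE afterwards lets the planner gather statistics for the new expression index.

```sql
-- Sample paths are made up; only the function name comes from the definition above.
SELECT gorfs.f_get_account_from_full_path('/userfiles/account/12345/notes/678/photo.jpg') AS matching,
       gorfs.f_get_account_from_full_path('/tmp/other/file.txt') AS not_matching;

-- Collect statistics for the indexed expression on the table.
ANALYZE gorfs.inode_segments;
```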

Make sure there is an index (or better, a key, if applicable) on gorfs.inodes(st_ino)!
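One way to check this is to list the indexes that already exist on the table. The sketch below assumes none of them covers st_ino; the index name ix_inodes_st_ino is hypothetical, and if st_ino is already the primary key nothing more is needed.

```sql
-- List the indexes currently defined on gorfs.inodes.
SELECT indexname, indexdef
FROM pg_indexes
WHERE schemaname = 'gorfs'
  AND tablename = 'inodes';

-- Only if nothing above covers st_ino (hypothetical index name).
-- CONCURRENTLY avoids blocking writes, but cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY ix_inodes_st_ino ON gorfs.inodes (st_ino);
```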

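You call split_part several times for every row, which may be costing you a lot. I have replaced it with string_to_array and then pick out the individual parts as needed. I also don't understand what you intend to achieve for file_name using reverse? The query below returns the last element of the path instead.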
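To illustrate, the following sketch splits a made-up path and reads the fields from the resulting array. The path and its layout are assumptions for illustration only. Because the path starts with a slash, the first array element is an empty string, so the array positions line up with the field numbers used with split_part.

```sql
-- The sample path is invented; positions 4 and 6 mirror split_part(full_path, '/', 4) and (..., 6).
SELECT parts[4]::int                AS account_id,  -- 12345
       parts[6]::int                AS note_id,     -- 678
       parts[array_upper(parts, 1)] AS file_name    -- photo.jpg (last element)
FROM (
  SELECT string_to_array('/userfiles/account/12345/notes/678/file/9/thumb/photo.jpg', '/') AS parts
) AS t;
```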

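Your query returns millions of rows. Even if PostgreSQL processes the query reasonably quickly, your client application (especially if you use PgAdminIII) will struggle to allocate enough memory and to receive and format the results, and that is probably what takes the most time. So you may want to create a temporary table with the results and then query against that temp table: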

CREATE TEMP TABLE myresults AS
WITH
 accounts AS (
   SELECT id
   FROM public.ja_clients
   WHERE NOT (
              (last_sub_pay > EXTRACT('epoch' FROM now() - '4 Months'::INTERVAL) AND price_model > 0) OR
              regdate > EXTRACT('epoch' FROM now() - '3 Month'::INTERVAL) OR
              (price_model = 0 AND jobcredits > 0 AND last_login > EXTRACT('epoch' FROM now() - '4 Month'::INTERVAL))
             )
   ORDER BY 1 LIMIT 100 -- first 100 accounts for testing purposes; comment out this line once the query is proven performant enough
   ) 
SELECT r.parts[4]::INT AS account_id, r.parts[6]::INT AS note_id, r.parts[9] AS variation,
      st_size, segment_index, r.parts[array_upper(r.parts, 1)] AS file_name, st_ino, full_path, size_mb
FROM (
 SELECT string_to_array(full_path, '/') AS parts, st_size, segment_index, i.st_ino, full_path,
        (i.st_size / 1000000::FLOAT)::NUMERIC(5,2) || 'MB' AS size_mb
 FROM gorfs.inode_segments s
 JOIN gorfs.inodes i ON (i.st_ino = s.st_ino_target)
 WHERE gorfs.f_get_account_from_full_path(s.full_path) IN (SELECT * FROM accounts)
   AND i.checksum_md5 IS NOT NULL
   AND i.st_size > 0
 ) r;

SELECT *
FROM myresults
LIMIT 100;

Source: https://dba.stackexchange.com/questions/137318