Postgresql

是否可以重寫此函式進行優化?

  • January 30, 2019

我需要幫助優化此功能,因為目前它需要超過 2 分鐘,這遠非最佳。此函式基本上為您提供列名、對應表和列中包含的唯一值列表。

功能定義:

CREATE OR REPLACE FUNCTION get_pubcol(
   included_tables name[] default '{}',
   included_schema name[] default '{}'
)
RETURNS table(schema_name text, table_name text, column_name text, column_value text)
AS $$
begin
 FOR schema_name, table_name, column_name IN
     SELECT c.table_schema, c.table_name, c.column_name
     FROM information_schema.columns c
     JOIN information_schema.tables t ON
       (t.table_name = c.table_name AND t.table_schema=c.table_schema)
     WHERE (c.table_name=ANY(included_tables) OR included_tables='{}')
       AND (c.table_schema=ANY(included_schema) OR included_schema='{}')
       AND t.table_type='BASE TABLE'
 LOOP
   EXECUTE format('SELECT ARRAY_AGG(DISTINCT %I) AS column_value FROM %I.%I',
      column_name,
      schema_name,
      table_name
   ) INTO column_value;
   IF column_value is not null THEN
     RETURN NEXT;
   END IF;
END LOOP;
END;
$$ language plpgsql;

函式呼叫:

SELECT * FROM get_pubcol('{}', '{public}')

解釋分析結果:

"Function Scan on get_pubcol  (cost=0.25..10.25 rows=1000 width=128) (actual time=121235.178..121235.199 rows=73 loops=1)"
"Planning time: 0.049 ms"
"Execution time: 121237.436 ms"

查詢結果:

schema_name table_name column_name   column_value
"public"    "MARA"  "Lab/Office"    "{" ",BL1,GL1,GP1,KL1,KP1,NaN,SL1,SP1,WL1,WL2,WP1}"

您正在為每個列和表執行一個查詢。優化這一點的一種方法是,只為每個表執行一個查詢,一次性檢索所有列的不同值。此結果可以放入 JSONB 中,然後可以使用以下命令作為單獨的行返回jsonb_each

CREATE OR REPLACE FUNCTION get_pubcol(
   included_tables name[] default '{}',
   included_schema name[] default '{}'
)
RETURNS table(schema_name text, table_name text, column_name text, column_value text)
AS 
$$
declare
 l_value jsonb;
 l_sql   text;
begin
 FOR schema_name, table_name, l_sql IN
   SELECT c.table_schema, c.table_name, 
          concat('select to_jsonb(t) from ( select ', 
                 string_agg(format('array_agg(distinct %I) as %I', c.column_name, c.column_name), ', '), 
                 format(' from %I.%I) as t', c.table_schema, c.table_name)) as sql
   FROM information_schema.columns c
     JOIN information_schema.tables t
       ON t.table_name = c.table_name AND t.table_schema=c.table_schema
   WHERE t.table_type = 'BASE TABLE'
     AND (c.table_name = ANY(included_tables) OR included_tables='{}')
     AND (c.table_schema = ANY(included_schema) OR included_schema='{}')
   group by c.table_schema, c.table_name
 LOOP
   EXECUTE l_sql into l_value;
   IF l_value is not null THEN
     return query 
       select schema_name, table_name, x.*
       from jsonb_each_text(l_value) as x(column_name, column_value);
   END IF;
END LOOP;
END;
$$ language plpgsql;

外部查詢為每個表組裝一個查詢。例如,如果您有一個person包含 , 列的表id,則循環查詢firstnamelastname的表達式將生成如下內容:

select to_jsonb(t) 
from ( 
  select array_agg(distinct lastname) as lastname, array_agg(distinct id) as id, array_agg(distinct firstname) as firstname 
from public.person
) as t

該查詢返回一個包含所有列資訊的 jsonb 值。這儲存在局部變數l_value中,例如:

{
   "id": [42,43,44,45],
   "lastname": ["Beeblebrox","Dent","McMillan","Prefect"],
   "firstname": ["Arthur","Ford","Tricia","Zaphod"]
}

然後使用jsonb_each_text.

但是顯示會有所不同。

你的例子

{BL1,GL1,GP1,KL1,KP1,NaN,SL1,SP1,WL1,WL2,WP1}

將顯示為

["BL1","GL1","GP1","KL1","KP1","NaN","SL1","SP1","WL1","WL2","WP1"]

引用自:https://dba.stackexchange.com/questions/228406