對於每個類別,使用 PostgreSQL 遞歸 CTE 查找所有子類別中外鍵項的計數
我在 PostgreSQL 9.4 中有一個典型的樹結構儲存為鄰接列表:
gear_category ( id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER );
以及附加到類別的項目列表:
gear_item ( id INTEGER PRIMARY KEY, name TEXT, category_id INTEGER REFERENCES gear_category );
任何類別都可以附加裝備物品,而不僅僅是樹葉。
出於速度原因,我想預先計算每個類別的一些數據,我將使用這些數據來生成物化視圖。
期望的輸出:
speedy_materialized_view ( gear_category_id INTEGER, count_direct_child_items INTEGER, count_recursive_child_items INTEGER );
count_recursive_child_items
是附加到目前類別或任何子類別的 GearItem 的累積數量。每個類別應該有一行,任何為 0 的計數都應為 0。為了計算這個,我們需要使用遞歸 CTE 來遍歷樹:
WITH RECURSIVE children(id, parent_id) AS ( --base case SELECT gear_category.id AS id, gear_category.parent_id AS parent_id FROM gear_category WHERE gear_category.id = 37 -- setting this to id includes current object -- setting to parent_id includes only children --combine with recursive part UNION ALL SELECT gear_category.id AS gear_category_id , gear_category.parent_id AS gear_category_parent_id FROM gear_category, children WHERE children.id = gear_category.parent_id ) TABLE children;
計算附加到此子類別列表的子齒輪項目很簡單:
--Subselect variant SELECT count(gear_item.id) AS count_recursive_child_items_for_single_cat FROM gear_item WHERE gear_item.category_id IN ( SELECT children.id AS children_id FROM children); -- JOIN variant SELECT count(gear_item.id) AS count_recursive_child_items_for_single_cat FROM gear_item, children WHERE gear_item.category_id = children.id;
但是,如果您查看 CTE,我已將起始類別 ID 硬編碼為“37”。我不知道如何組合這些查詢來為所有類別生成 count_recursive_child_items,而不僅僅是一個類別。
我如何結合這些?
此外,目前對於每個類別,我都會計算所有子類別,這會產生很多重複的工作,我不知道如何刪除它。例如,假設我有祖父母 > 父母 > 葉。目前我分別計算祖父母和父母的子類別,這意味著我兩次計算父>葉關係。
而且由於我已經
count_direct_child_items
為每個類別返回,因此在計算時使用它們可能會更快,count_recursive_child_items
而不是像我現在那樣從頭開始計算。另外,這些概念中的每一個對我來說都是有意義的。我只是不知道如何將它們組合成一個優雅/優化的查詢。
這可以完成工作:
CREATE MATERIALIZED VIEW speedy_materialized_view AS WITH RECURSIVE tree AS ( SELECT id, parent_id, ARRAY[id] AS path FROM gear_category WHERE parent_id IS NULL UNION ALL SELECT c.id, c.parent_id, path || c.id FROM tree t JOIN gear_category c ON c.parent_id = t.id ) , tree_ct AS ( SELECT t.id, t.path, COALESCE(i.item_ct, 0) AS item_ct FROM tree t LEFT JOIN ( SELECT category_id AS id, count(*) AS item_ct FROM gear_item GROUP BY 1 ) i USING (id) ) SELECT t.id , t.item_ct AS count_direct_child_items , sum(t1.item_ct) AS count_recursive_child_items FROM tree_ct t LEFT JOIN tree_ct t1 ON t1.path[1:array_upper(t.path, 1)] = t.path GROUP BY t.id, t.item_ct;
count_recursive_child_items
每個類別都是單獨計算的,所以我不相信這是深度樹最快的方法。但是,遞歸 CTE 中不允許使用聚合函式。
遞歸函式
嚴格來說,它確實是迭代的——但“遞歸”CTE 也是如此。
您可以建構一個使用臨時表的函式。您需要了解 plpgsql 的使用方式,否則需要解釋的內容太多。
CREATE OR REPLACE FUNCTION f_tree_ct() RETURNS TABLE (id int, count_direct_child_items int, count_recursive_child_items int) LANGUAGE plpgsql AS $func$ DECLARE _lvl int; BEGIN -- basic table with added path and count CREATE TEMP TABLE t1 AS WITH RECURSIVE tree AS ( SELECT c.id, c.parent_id, '{}'::int[] AS path, 0 AS lvl FROM gear_category c WHERE c.parent_id IS NULL UNION ALL SELECT c.id, c.parent_id, path || c.parent_id, lvl + 1 FROM tree t JOIN gear_category c ON c.parent_id = t.id ) , tree_ct AS ( SELECT t.id, t.parent_id, t.path, t.lvl, COALESCE(i.item_ct, 0) AS item_ct FROM tree t LEFT JOIN ( SELECT i.category_id AS id, count(*)::int AS item_ct FROM gear_item i GROUP BY 1 ) i USING (id) ) TABLE tree_ct; -- CREATE INDEX ON t1 (lvl); -- only for very deep trees SELECT INTO _lvl max(lvl) FROM t1; -- identify max lvl to start bottom up -- recursively aggregate each level in 2nd temp table CREATE TEMP TABLE t2 AS SELECT t1.id, t1.parent_id, t1.lvl , t1.item_ct , t1.item_ct AS sum_ct FROM t1 WHERE t1.lvl = _lvl; IF _lvl > 0 THEN FOR i IN REVERSE _lvl .. 1 LOOP INSERT INTO t2 SELECT t1.id, t1.parent_id, t1.lvl, t1.item_ct , CASE WHEN t2.sum_ct IS NULL THEN t1.item_ct ELSE t1.item_ct + t2.sum_ct END FROM t1 LEFT JOIN ( SELECT t2.parent_id AS id, sum(t2.sum_ct) AS sum_ct FROM t2 WHERE t2.lvl = i GROUP BY 1 ) t2 USING (id) WHERE t1.lvl = i - 1; END LOOP; END IF; RETURN QUERY -- only requested columns, unsorted SELECT t2.id, t2.item_ct, t2.sum_ct FROM t2; DROP TABLE t1, t2; -- to allow repeated execution in one transaction RETURN; END $func$;
CREATE MATERIALIZED VIEW
由於使用了臨時表,這不能包含在語句中。您可以使用它創建另一個(臨時)表,充當手動維護的“物化視圖”:CREATE TABLE speedy_materialized_view AS SELECT * FROM f_tree_ct();
或者,您可以
TRUNCATE speedy_materialized_view
在函式中直接寫入。該函式將RETURNS void
改為,或者您可以返回一些元資訊,如行數……db<>fiddle here
另外:
CTE 的遞歸項中的列別名僅用於文件,因為輸出列名稱僅由非遞歸項確定。