Postgresql

從 PostgreSQL 中的嵌套表中進行選擇的更好方法

  • June 28, 2020

我有以下模式,匹配表中有數百萬行,每個匹配都有兩側(客場和主場)。我想創建一個視圖,該視圖顯示有關將 id 替換為名稱的匹配項的最重要數據。名字可以改變,這就是教練和球隊分開表格的原因。

圖式

我想出了兩種方法,我對這兩種方法都不滿意。

CREATE VIEW "fumbbl"."matches" AS
 SELECT "fmid", "time",
   (SELECT "name" FROM "fumbbl"."division"
     WHERE "fdid" = m."fdid") "division",
   (SELECT "coachbracket" FROM "fumbbl"."side"
     WHERE "fmid" = m."fmid" AND "away" IS FALSE) "hbracket",
   (SELECT "name" FROM "fumbbl"."coach"
     WHERE "fcid" = (SELECT "fcid" FROM "fumbbl"."side"
       WHERE "fmid" = m."fmid" AND "away" IS FALSE)) "hcoach",
   (SELECT "name" FROM "fumbbl"."team"
     WHERE "ftid" = (SELECT "ftid" FROM "fumbbl"."side"
       WHERE "fmid" = m."fmid" AND "away" IS FALSE)) "hteam",
   (SELECT "kilotv" FROM "fumbbl"."side"
     WHERE "fmid" = m."fmid" AND "away" IS FALSE) "hktv",
   (SELECT "race" FROM "fumbbl"."team"
     WHERE "ftid" = (SELECT "ftid" FROM "fumbbl"."side"
       WHERE "fmid" = m."fmid" AND "away" IS FALSE)) "hrace",
   (SELECT "score" FROM "fumbbl"."side"
     WHERE "fmid" = m."fmid" AND "away" IS FALSE) "hscore",
   (SELECT "score" FROM "fumbbl"."side"
     WHERE "fmid" = m."fmid" AND "away" IS TRUE) "ascore",
   (SELECT "race" FROM "fumbbl"."team"
     WHERE "ftid" = (SELECT "ftid" FROM "fumbbl"."side"
       WHERE "fmid" = m."fmid" AND "away" IS TRUE)) "arace",
   (SELECT "kilotv" FROM "fumbbl"."side"
     WHERE "fmid" = m."fmid" AND "away" IS TRUE) "aktv",
   (SELECT "name" FROM "fumbbl"."team"
     WHERE "ftid" = (SELECT "ftid" FROM "fumbbl"."side"
       WHERE "fmid" = m."fmid" AND "away" IS TRUE)) "ateam",
   (SELECT "name" FROM "fumbbl"."coach"
     WHERE "fcid" = (SELECT "fcid" FROM "fumbbl"."side"
       WHERE "fmid" = m."fmid" AND "away" IS TRUE)) "acoach",
   (SELECT "coachbracket" FROM "fumbbl"."side"
     WHERE "fmid" = m."fmid" AND "away" IS TRUE) "abracket"
 FROM "fumbbl"."match" m ORDER BY "fmid" DESC;

這個問題是我猜它多次執行相同的子查詢,這使得它無效。然而,它在有限的行數上比第二個表現更好。但是,在對所有數據進行選擇或在 fmid 上進行升序選擇時,它會變慢。

第一行解釋:

"Index Scan using match_fmid_desc on match m  (cost=0.42..46098616.62 rows=289261 width=14)"

這裡使用的索引是

CREATE INDEX "match_fmid_desc" ON "fumbbl"."match" ("fmid" DESC);

第二種方法使用連接:

CREATE VIEW "fumbbl"."matches" AS
 SELECT
     m."fmid",
     m."time",
     (SELECT "name" FROM "fumbbl"."division"
        WHERE "fdid" = m."fdid") "division",
     "hbracket",
     (SELECT "name" FROM "fumbbl"."coach"
        WHERE "fcid" = hst."fcid") "hcoach",
     "hteam",
     "hktv",
     "hrace",
     "hscore",
     "ascore",
     "arace",
     "aktv",
     "ateam",
     (SELECT "name" FROM "fumbbl"."coach"
        WHERE "fcid" = ast."fcid") "acoach",
     "abracket"
   FROM (
     SELECT *
     FROM "fumbbl"."match"
     ) m
   JOIN (
     (SELECT
       "fmid", "away", "fcid", "ftid",
       "coachbracket" AS "hbracket",
       "kilotv" AS "hktv",
       "score" AS "hscore"
     FROM "fumbbl"."side") hside
     JOIN (
       SELECT
         "ftid",
         "name" AS "hteam",
         "race" AS "hrace"
       FROM "fumbbl"."team") ht
       ON (hside."ftid" = ht."ftid")
     ) hst
   ON (m."fmid" = hst."fmid" AND hst."away" IS FALSE)
   JOIN (
     (SELECT
       "fmid", "away", "fcid", "ftid",
       "coachbracket" AS "abracket",
       "kilotv" AS "aktv",
       "score" AS "ascore"
     FROM "fumbbl"."side") aside
     JOIN (
       SELECT
         "ftid",
         "name" AS "ateam",
         "race" AS "arace"
       FROM "fumbbl"."team") at
       ON (aside."ftid" = at."ftid")
     ) ast
   ON (m."fmid" = ast."fmid" AND ast."away" IS TRUE)
 ORDER BY "fmid" DESC;

第一行解釋:

"Sort  (cost=7238489.68..7239207.00 rows=286927 width=84)"

我的問題是它在提供單行之前連接了整個表。雖然在整個數據上表現更好。

我對 PostgreSQL 很陌生,所以也許我錯過了一種以有效方式執行此類查詢的方法。我很高興有一個迭代方法,每次迭代只獲取一次兩側的數據。

編輯:在歐文的回答之後,解釋值表現得很好,我仍然想嘗試另一種方式。我做了一個函式,它通過獲取匹配的邊欄位來完成大部分工作。該函式由視圖使用。

CREATE OR REPLACE FUNCTION fumbbl.get_sides_data(
 in "fmid" integer,
 out "hcoach" character varying,
 out "hcoachbracket" "fumbbl"."bracket",
 out "hteam" character varying,
 out "hrace" character varying,
 out "hkilotv" smallint,
 out "hscore" smallint,
 out "acoach" character varying,
 out "acoachbracket" "fumbbl"."bracket",
 out "ateam" character varying,
 out "arace" character varying,
 out "akilotv" smallint,
 out "ascore" smallint
) AS
$$
DECLARE
 sfcid integer;
 sftid integer;
BEGIN
 SELECT s."fcid", s."coachbracket", s."kilotv", s."ftid", s."score"
   from fumbbl.side s
   WHERE (s."fmid" = $1 AND s."away" IS FALSE) INTO sfcid, $3, $6, sftid, $7;
 SELECT c."name" from fumbbl.coach c WHERE c."fcid" = sfcid INTO $2;
 SELECT t."name", t."race" from fumbbl.team t WHERE t."ftid" = sftid INTO $4, $5;

 SELECT s."fcid", s."coachbracket", s."kilotv", s."ftid", s."score"
   from fumbbl.side s
   WHERE (s."fmid" = $1 AND s."away" IS TRUE) INTO sfcid, $9, $12, sftid, $13;
 SELECT c."name" from fumbbl.coach c WHERE c."fcid" = sfcid INTO $8;
 SELECT t."name", t."race" from fumbbl.team t WHERE t."ftid" = sftid INTO $10, $11;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE VIEW fumbbl.matches4 AS
SELECT
m.fmid,
m.time,
(SELECT "name" FROM "fumbbl"."division" WHERE "fdid" = m."fdid") "division",
s."hcoach",
s."hcoachbracket",
s."hteam",
s."hrace",
s."hkilotv",
s."hscore",
s."acoach",
s."acoachbracket",
s."ateam",
s."arace",
s."akilotv",
s."ascore"
FROM fumbbl.match m, fumbbl.get_sides_data(m.fmid) s
ORDER  BY m.fmid DESC;

結論:

我意識到解釋不能衡量呼叫函式的視圖的有效性。所以我最終將選項(包括歐文的)與計時器進行了比較。毫不奇怪,選項 1(許多選擇)和選項 3(函式呼叫者)在帶有發燒行的選擇中表現更好。選項 3 似乎在我的數據庫中的 1-10 行中占主導地位。但是,對於數十萬行,JOIN 是一種方式。通過使用 JOIN,您必須等待整個操作才能獲得單行。令人驚訝的是,Erwin 的方式似乎和選項 2(嵌套連接)一樣快,這讓我覺得 PG 在後台做了某種優化。儘管如此,Erwin 的程式碼還是不錯且更易於維護。

在 Postgres 9.4 中,您可以使用聚合FILTER子句進行簡化:

CREATE VIEW fumbbl.matches AS
SELECT m.fmid, m.time, d.name AS division
    , min(s.coachbracket) FILTER (WHERE NOT s.away) AS hbracket
    , min(t.name)         FILTER (WHERE NOT s.away) AS hteam
    , min(c.name)         FILTER (WHERE NOT s.away) AS hcoach
    , min(s.score)        FILTER (WHERE NOT s.away) AS hscore
    , min(t.race)         FILTER (WHERE NOT s.away) AS hrace
    , min(s.kilotv)       FILTER (WHERE NOT s.away) AS hktv

    , min(s.coachbracket) FILTER (WHERE s.away) AS abracket
    , min(t.name)         FILTER (WHERE s.away) AS ateam
    , min(c.name)         FILTER (WHERE s.away) AS acoach
    , min(s.score)        FILTER (WHERE s.away) AS ascore
    , min(t.race)         FILTER (WHERE s.away) AS arace
    , min(s.kilotv)       FILTER (WHERE s.away) AS aktv
FROM   fumbbl.match    m
JOIN   fumbbl.division d USING (fdid)
JOIN   fumbbl.side     s USING (fmid)
JOIN   fumbbl.team     t USING (ftid)
JOIN   fumbbl.coach    c USING (fcid)
GROUP  BY m.fmid, d.fdid  -- PK columns
ORDER  BY m.fmid DESC;

每個表只加入一次。但它需要一個聚合步驟才能將主客場折疊成一排。

FILTER條款的詳細資訊:

引用自:https://dba.stackexchange.com/questions/94773