Oracle

按 x 分組,也可以獲取其他欄位(比較選項和性能)

  • October 22, 2021

我是新手,試圖找到總結數據的最佳方法。甲骨文 19c。


我有一個 WORKORDER 表,其中包含工作訂單 (ISTASK=0) 和任務 (ISTASK=1)。

  • 任務是工單的子級。
  • 工作訂單和任務按 WOGROUP 分組。

成本分為四列:

  • actlabcost(實際人工成本)
  • actmatcost(實際材料成本)
  • acttoolcost(實際工具成本)
  • actservcost(實際服務成本)

成本列不可為空。所以我們不需要擔心將空值轉換為零以避免對空值進行數學運算。


Select 'WO1361' as WONUM, 'WO1361' as WOGROUP, 0 as ISTASK, 0 as ACTLABCOST, 0 as ACTMATCOST, 0 as ACTTOOLCOST, 0 as ACTSERVCOST, '167457977' as OTHER_WO_COLUMNS FROM DUAL
UNION ALL
Select 'WO1362' as WONUM, 'WO1362' as WOGROUP, 0 as ISTASK, 0 as ACTLABCOST, 0 as ACTMATCOST, 0 as ACTTOOLCOST, 0 as ACTSERVCOST, '167458280' as OTHER_WO_COLUMNS FROM DUAL
UNION ALL
Select 'WO1363' as WONUM, 'WO1363' as WOGROUP, 0 as ISTASK, 270.14 as ACTLABCOST, 0 as ACTMATCOST, 0 as ACTTOOLCOST, 0 as ACTSERVCOST, '167483430' as OTHER_WO_COLUMNS FROM DUAL
UNION ALL
Select 'WO1367' as WONUM, 'WO1363' as WOGROUP, 1 as ISTASK, 540.27 as ACTLABCOST, 0 as ACTMATCOST, 0 as ACTTOOLCOST, 0 as ACTSERVCOST, '167482806' as OTHER_WO_COLUMNS FROM DUAL
UNION ALL
Select 'WO1370' as WONUM, 'WO1363' as WOGROUP, 1 as ISTASK, 202.6 as ACTLABCOST, 0 as ACTMATCOST, 0 as ACTTOOLCOST, 0 as ACTSERVCOST, '167483431' as OTHER_WO_COLUMNS FROM DUAL
UNION ALL
Select 'WO1364' as WONUM, 'WO1364' as WOGROUP, 0 as ISTASK, 88.86 as ACTLABCOST, 0 as ACTMATCOST, 0 as ACTTOOLCOST, 0 as ACTSERVCOST, '167459454' as OTHER_WO_COLUMNS FROM DUAL
UNION ALL
Select 'WO1366' as WONUM, 'WO1364' as WOGROUP, 1 as ISTASK, 33.77 as ACTLABCOST, 0 as ACTMATCOST, 0 as ACTTOOLCOST, 0 as ACTSERVCOST, '167458946' as OTHER_WO_COLUMNS FROM DUAL
UNION ALL
Select 'WO1365' as WONUM, 'WO1365' as WOGROUP, 0 as ISTASK, 67.53 as ACTLABCOST, 0 as ACTMATCOST, 0 as ACTTOOLCOST, 0 as ACTSERVCOST, '167459331' as OTHER_WO_COLUMNS FROM DUAL
UNION ALL
Select 'WO1368' as WONUM, 'WO1368' as WOGROUP, 0 as ISTASK, 236.37 as ACTLABCOST, 0 as ACTMATCOST, 0 as ACTTOOLCOST, 0 as ACTSERVCOST, '167461627' as OTHER_WO_COLUMNS FROM DUAL

WONUM  WOGROUP     ISTASK ACTLABCOST ACTMATCOST ACTTOOLCOST ACTSERVCOST OTHER_WO_COLUMNS
------ ------- ---------- ---------- ---------- ----------- ----------- ----------------
WO1361 WO1361           0          0          0           0           0 167457977       
WO1362 WO1362           0          0          0           0           0 167458280       
WO1363 WO1363           0     270.14          0           0           0 167483430       
WO1367 WO1363           1     540.27          0           0           0 167482806       
WO1370 WO1363           1      202.6          0           0           0 167483431       
WO1364 WO1364           0      88.86          0           0           0 167459454       
WO1366 WO1364           1      33.77          0           0           0 167458946       
WO1365 WO1365           0      67.53          0           0           0 167459331       
WO1368 WO1368           0     236.37          0           0           0 167461627       


Notice rows 3-5 are in WOGROUP #WO1363. And rows 6-7 are in WOGROUP #WO1364.
                                                                            

問題:

我想按 WOGROUP 匯總工單成本(包括任務成本),但我不想在結果集中顯示任務行。換句話說,我想將任務成本匯總到他們的父工單。

對於工單行,我還想包括未分組的其他列(即 OTHER_WO_COLUMNS)。


我找到了幾種方法。

選項 #1:(GROUP BY、JOIN 和 SUM)

該查詢在子查詢中執行 GROUP BY (SUM) 以獲取總工單成本。然後它選擇工作訂單(不包括任務)並加入子查詢以引入總成本。

--The suffix "_ti" stands for "tasks included".
select
   a.wonum,
   a.istask,
   b.actlabcost_ti,
   b.actmatcost_ti,
   b.actservcost_ti,
   b.acttoolcost_ti,
   b.acttotalcost_ti,
   other_wo_columns
from
   cte a
left join
   (
   select 
       wogroup as wonum, 
       sum(actlabcost)  as actlabcost_ti,
       sum(actmatcost)  as actmatcost_ti,
       sum(actservcost) as actservcost_ti,
       sum(acttoolcost) as acttoolcost_ti,
       sum(actlabcost + actmatcost + actservcost + acttoolcost) as acttotalcost_ti
   from 
       cte 
   group by 
       wogroup
   ) b
   on a.wonum = b.wonum
where
   istask = 0

WONUM      ISTASK ACTLABCOST_TI ACTMATCOST_TI ACTSERVCOST_TI ACTTOOLCOST_TI ACTTOTALCOST_TI OTHER_WO_COLUMNS
------ ---------- ------------- ------------- -------------- -------------- --------------- ----------------
WO1361          0             0             0              0              0               0 167457977       
WO1362          0             0             0              0              0               0 167458280       
WO1363          0       1013.01             0              0              0         1013.01 167483430       
WO1364          0        122.63             0              0              0          122.63 167459454       
WO1365          0         67.53             0              0              0           67.53 167459331       
WO1368          0        236.37             0              0              0          236.37 167461627  

我在一個完整的生產表(WORKORDER 表有 4,500 行)上執行了查詢並得到了這個解釋計劃:

Plan hash value: 1879239811

---------------------------------------------------------------------------------
| Id  | Operation           | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |           |  9841 |   586K|   622   (2)| 00:00:01 |
|   1 |  SORT GROUP BY      |           |  9841 |   586K|   622   (2)| 00:00:01 |
|*  2 |   HASH JOIN OUTER   |           |  9841 |   586K|   620   (1)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| WORKORDER |  4609 |   184K|   310   (1)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| WORKORDER |  9841 |   192K|   310   (1)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

  2 - access("A"."WONUM"="WOGROUP"(+))
  3 - filter("A"."ISTASK"=0)

估計費用為622


選項 #2:(SUM 分析函式)

我找到了一種使用 SUM分析函式的方法。我將 SUM 分析函式查詢包裝在隱藏任務行的外部查詢中。

--The suffix "_ti" stands for "tasks included".  
select
   wonum,
   istask,
   actlabcost_ti,
   actmatcost_ti,
   acttoolcost_ti,
   actservcost_ti,
   acttotalcost_ti,
   other_wo_columns
from
   (
   select
       wogroup as wonum,
       istask,
       sum(actlabcost ) over (partition by wogroup) as actlabcost_ti,
       sum(actmatcost ) over (partition by wogroup) as actmatcost_ti,
       sum(acttoolcost) over (partition by wogroup) as acttoolcost_ti,
       sum(actservcost) over (partition by wogroup) as actservcost_ti,
       sum(actlabcost + actmatcost + acttoolcost + actservcost) over (partition by wogroup) as acttotalcost_ti,
       other_wo_columns
   from
       cte
   )
where
   istask=0

WONUM      ISTASK ACTLABCOST_TI ACTMATCOST_TI ACTTOOLCOST_TI ACTSERVCOST_TI ACTTOTALCOST_TI OTHER_WO_COLUMNS
------ ---------- ------------- ------------- -------------- -------------- --------------- ----------------
WO1361          0             0             0              0              0               0 167457977       
WO1362          0             0             0              0              0               0 167458280       
WO1363          0       1013.01             0              0              0         1013.01 167483430       
WO1364          0        122.63             0              0              0          122.63 167459454       
WO1365          0         67.53             0              0              0           67.53 167459331       
WO1368          0        236.37             0              0              0          236.37 167461627    

我也在生產中執行了這個查詢,並得到了這個解釋計劃:

Plan hash value: 2003557620

---------------------------------------------------------------------------------
| Id  | Operation           | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |           |  9841 |  1749K|   312   (2)| 00:00:01 |
|*  1 |  VIEW               |           |  9841 |  1749K|   312   (2)| 00:00:01 |
|   2 |   WINDOW SORT       |           |  9841 |   394K|   312   (2)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| WORKORDER |  9841 |   394K|   310   (1)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

  1 - filter("ISTASK"=0)

估計成本為312,大約是第一次查詢的一半

我認為它更快,因為它只進行一次全表掃描(另一個查詢進行了兩次全掃描)。


問題:

總結這些數據的最佳/最快方法是什麼?

如前所述,我注意到選項#2 比#1 快。但老實說,#2 對我來說似乎有點倒退。我在 Stack Overflow 上對這種方法提出了一些非常嚴厲的批評,所以我猜這不是建構查詢的好方法。

我的明確偏好是選項#2。恕我直言,這很直接。您對每個分區進行聚合,然後只保留您感興趣的行。由於沒有框架子句 ( BETWEEN) 的視窗函式的計算得到了很好的優化,並不是您做了很多事然後就扔掉了。

儘管如此,還是有一些關於選項 #1 的回饋:

  • 你為什麼採取left join(不是內部連接?)
  • other_wo_columns您可以使用選擇性聚合來代替自連接:

MAX(CASE WHEN istask = 0 THEN other_wo_column END)

如果每組只有一行istask = 0,它將只返回該行的值。我使用max它是因為它適用於所有數據類型——比如min——但不是sum.

當然,如果你有很多列,程式碼會變得有點混亂。

這很可能是最快的方法,但與#2 相比可能差距不大。

有關此技術的更多資訊:

最後,關於 COST

不要使用COST作為衡量不同查詢方法的指標。COST 值用於確定在具有相同數據的相同數據庫中的相同查詢的各種可能性中的最佳****可能執行計劃。COST 值的計算已針對此特定目的進行了高度優化——即,出於性能原因,不需要服務於此特定目的的所有內容都將被跳過。因此,使用不同的東西很有可能給出錯誤的結果。

它只是不用於比較不同的查詢

引用自:https://dba.stackexchange.com/questions/299932