Oracle
使用分區表,如何使用提示分別對每個分區進行分組
假設我有下表
數據(partitioned_key_index,some_dummy_measure)
假設分區的大小相同。使用 Oracle 11g。
最終結果應該是這樣的
select partitioned_key_index, sum(some_dummy_measure) from data group by partitioned_key_index
每個分區將被獨立分組,優化器應該
足夠聰明,以提出一個計劃,其中每個分區將被聚合,然後一個簡單的“聯合所有”以獲得所需的輸出。
我想做的是接近這個
select 1 as partitioned_key_index, sum(some_dummy_measure) from data where partitioned_key_index = 1 Union All select 2 as partitioned_key_index, sum(some_dummy_measure) from data where partitioned_key_index = 2 Union All . . . select i as partitioned_key_index, sum(some_dummy_measure) from data where partitioned_key_index = i
我對上述方法的直覺是序列化雜湊組操作
,因此每個分區將從磁碟移動到緩衝區記憶體,希望不會溢出到磁碟進行分組操作。
任何想法如何調整這種查詢?
數據庫預設執行此操作,無需提示/調整。
create table data(partitioned_key_index number, some_dummy_measure number) partition by list (partitioned_key_index) ( partition p1 values (1), partition p2 values (2), partition p3 values (3), partition p4 values (4), partition p5 values (5) ); insert into data with g as (select * from dual connect by level <= 1000) select mod(rownum, 5) + 1, rownum from g,g where rownum <= 500000; commit;
然後執行查詢:
alter session set statistics_level=all; select partitioned_key_index, sum(some_dummy_measure) from data group by partitioned_key_index; PARTITIONED_KEY_INDEX SUM(SOME_DUMMY_MEASURE) --------------------- ----------------------- 1 25000250000 2 24999850000 3 24999950000 4 25000050000 5 25000150000
檢查發生了什麼:
SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last')); PLAN_TABLE_OUTPUT ----------------------------------------------------------------------------------------------------------------- SQL_ID 537qapda9hdy4, child number 0 ------------------------------------- select partitioned_key_index, sum(some_dummy_measure) from data group by partitioned_key_index Plan hash value: 3405952922 ----------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem | ----------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 5 |00:00:00.14 | 1065 | | | | | 1 | PARTITION LIST ALL | | 1 | 585K| 5 |00:00:00.14 | 1065 | | | | | 2 | HASH GROUP BY | | 5 | 585K| 5 |00:00:00.14 | 1065 | 34M| 6473K| 738K (0)| | 3 | TABLE ACCESS FULL| DATA | 5 | 585K| 500K|00:00:00.06 | 1065 | | | | ----------------------------------------------------------------------------------------------------------------- Note ----- - dynamic sampling used for this statement (level=2)
PARTITION LIST ALL
對所有分區都進行了以下操作。我們有 5 個分區,包括GROUP BY
. 從Starts
列中可以看出,HASH GROUP BY
確實執行了 5 次。這是它的樣子,當
GROUP BY
一次對整個表執行時:select (partitioned_key_index + 0), sum(some_dummy_measure) from data group by (partitioned_key_index + 0); (PARTITIONED_KEY_INDEX+0) SUM(SOME_DUMMY_MEASURE) ------------------------- ----------------------- 1 25000250000 2 24999850000 5 25000150000 4 25000050000 3 24999950000 SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last')); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- SQL_ID 2xxf42mtp53sc, child number 0 ------------------------------------- select (partitioned_key_index + 0), sum(some_dummy_measure) from data group by (partitioned_key_index + 0) Plan hash value: 3651737839 ----------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem | ----------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 5 |00:00:00.24 | 1065 | | | | | 1 | HASH GROUP BY | | 1 | 585K| 5 |00:00:00.24 | 1065 | 34M| 6473K| 4574K (0)| | 2 | PARTITION LIST ALL| | 1 | 585K| 500K|00:00:00.14 | 1065 | | | | | 3 | TABLE ACCESS FULL| DATA | 5 | 585K| 500K|00:00:00.06 | 1065 | | | | ----------------------------------------------------------------------------------------------------------------- Note ----- - dynamic sampling used for this statement (level=2) 20 rows selected.
HASH GROUP BY
在從所有分區收集數據後,僅對全部數據執行一次。