Oracle

使用分區表,如何使用提示分別對每個分區進行分組

  • September 10, 2020

假設我有下表

數據(partitioned_key_index,some_dummy_measure)

假設分區的大小相同。使用 Oracle 11g。

最終結果應該是這樣的

select partitioned_key_index, sum(some_dummy_measure) 
 from data group by partitioned_key_index 

每個分區將被獨立分組,優化器應該

足夠聰明,以提出一個計劃,其中每個分區將被聚合,然後一個簡單的“聯合所有”以獲得所需的輸出。

我想做的是接近這個

select 1 as partitioned_key_index, sum(some_dummy_measure) 
 from data where partitioned_key_index = 1  
Union All
select 2 as partitioned_key_index, sum(some_dummy_measure) 
 from data where partitioned_key_index = 2
Union All
 .
 .
 .
select i as partitioned_key_index, sum(some_dummy_measure) 
 from data where partitioned_key_index = i

我對上述方法的直覺是序列化雜湊組操作

,因此每個分區將從磁碟移動到緩衝區記憶體,希望不會溢出到磁碟進行分組操作。

任何想法如何調整這種查詢?

數據庫預設執行此操作,無需提示/調整。

create table data(partitioned_key_index number, some_dummy_measure number)
 partition by list (partitioned_key_index)
(
    partition p1 values (1),
    partition p2 values (2),
    partition p3 values (3),
    partition p4 values (4),
    partition p5 values (5)
);

insert into data with g as (select * from dual connect by level <= 1000) 
select mod(rownum, 5) + 1, rownum from g,g where rownum <= 500000;
commit;

然後執行查詢:

alter session set statistics_level=all;

select partitioned_key_index, sum(some_dummy_measure) from data 
 group by partitioned_key_index;


PARTITIONED_KEY_INDEX SUM(SOME_DUMMY_MEASURE)
--------------------- -----------------------
                   1             25000250000
                   2             24999850000
                   3             24999950000
                   4             25000050000
                   5             25000150000

檢查發生了什麼:

SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------
SQL_ID  537qapda9hdy4, child number 0
-------------------------------------
select partitioned_key_index, sum(some_dummy_measure) from data group
by partitioned_key_index

Plan hash value: 3405952922

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      5 |00:00:00.14 |    1065 |       |       |          |
|   1 |  PARTITION LIST ALL |      |      1 |    585K|      5 |00:00:00.14 |    1065 |       |       |          |
|   2 |   HASH GROUP BY     |      |      5 |    585K|      5 |00:00:00.14 |    1065 |    34M|  6473K|  738K (0)|
|   3 |    TABLE ACCESS FULL| DATA |      5 |    585K|    500K|00:00:00.06 |    1065 |       |       |          |
-----------------------------------------------------------------------------------------------------------------

Note
-----
  - dynamic sampling used for this statement (level=2)

PARTITION LIST ALL對所有分區都進行了以下操作。我們有 5 個分區,包括GROUP BY. 從Starts列中可以看出,HASH GROUP BY確實執行了 5 次。

這是它的樣子,當GROUP BY一次對整個表執行時:

select (partitioned_key_index + 0), sum(some_dummy_measure) from data
   group by (partitioned_key_index + 0);

(PARTITIONED_KEY_INDEX+0) SUM(SOME_DUMMY_MEASURE)
------------------------- -----------------------
                       1             25000250000
                       2             24999850000
                       5             25000150000
                       4             25000050000
                       3             24999950000

SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  2xxf42mtp53sc, child number 0
-------------------------------------
select (partitioned_key_index + 0), sum(some_dummy_measure) from data
group by (partitioned_key_index + 0)

Plan hash value: 3651737839

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      5 |00:00:00.24 |    1065 |       |       |          |
|   1 |  HASH GROUP BY      |      |      1 |    585K|      5 |00:00:00.24 |    1065 |    34M|  6473K| 4574K (0)|
|   2 |   PARTITION LIST ALL|      |      1 |    585K|    500K|00:00:00.14 |    1065 |       |       |          |
|   3 |    TABLE ACCESS FULL| DATA |      5 |    585K|    500K|00:00:00.06 |    1065 |       |       |          |
-----------------------------------------------------------------------------------------------------------------

Note
-----
  - dynamic sampling used for this statement (level=2)


20 rows selected.

HASH GROUP BY在從所有分區收集數據後,僅對全部數據執行一次。

引用自:https://dba.stackexchange.com/questions/223140