為什麼查詢成本變化如此之大以及如何預防
導致我問題的查詢太大了,但似乎是核心的部分很簡單,我會嘗試這樣做:
查詢的結構如下:
SELECT ... a lot of stuff ... WHERE ... lots of complex clauses ... AND my_id in ( select my_id from small_table where .. something simple ..)
這有幾十億的基數,**讀取大約 12gb 的字節並爆炸了我的溫度。
但是,如果我執行
select my_id from small_table
生成(總是)7 條記錄的 ,請獲取這些記錄並將查詢更改為:SELECT ... a lot of stuff ... WHERE ... lots of complex clauses ... AND my_id in ( 1, 2, 3, 4, 5, 6, 7 )
(我的意思是,值硬編碼)
成本、基數和讀取的字節數急劇下降,查詢在幾分鐘內執行。
現在,我試圖在一個
with
子句中隔離“小查詢”,嘗試使用連接而不是子查詢,什麼都沒有……結果總是一樣的。為什麼會這樣,我怎麼可能阻止它發生?
也許值得一提的是,在這兩種情況下(快速和慢速),查詢的代價高昂的部分是連接中使用的大表之一上的 FTS。
另外,我正在使用 Oracle 11gR2
**$$ EDIT $$**這些是兩個範例執行的解釋計劃
壞的。請注意,我沒有使用
in ( )
,而是在子句中添加small_table
了from
一個簡單的連接。Plan SELECT STATEMENT ALL_ROWSCost: 5,736,441 22 HASH JOIN RIGHT SEMI Cost: 5,736,441 Bytes: 52,324,480 Cardinality: 158,080 11 VIEW VIEW VW_NSO_1 Cost: 20 Bytes: 13 Cardinality: 1 10 NESTED LOOPS 8 NESTED LOOPS Cost: 18 Bytes: 91 Cardinality: 1 6 NESTED LOOPS Cost: 4 Bytes: 70 Cardinality: 1 4 TABLE ACCESS BY INDEX ROWID TABLE MYUSER.SMALL_TABLE Cost: 1 Bytes: 35 Cardinality: 1 3 INDEX UNIQUE SCAN INDEX (UNIQUE) MYUSER.PK_SMALL_TABLE Cost: 0 Cardinality: 1 2 TABLE ACCESS BY INDEX ROWID TABLE MYUSER.SMALL_TABLE Cost: 2 Bytes: 27 Cardinality: 1 1 INDEX RANGE SCAN INDEX MYUSER.IDX_SMALL_TABLE_1ATUAL Cost: 1 Cardinality: 6 5 TABLE ACCESS FULL TABLE MYUSER.SMALL_TABLE Cost: 3 Bytes: 35 Cardinality: 1 7 INDEX RANGE SCAN INDEX (UNIQUE) MYUSER.PK_MEDIUM_TABLE Cost: 1 Cardinality: 53 9 TABLE ACCESS BY INDEX ROWID TABLE MYUSER.MEDIUM_TABLE Cost: 14 Bytes: 420 Cardinality: 20 21 HASH JOIN Cost: 5,736,193 Bytes: 15,281,925,342 Cardinality: 48,056,369 19 NESTED LOOPS 17 NESTED LOOPS Cost: 951,151 Bytes: 500,185,440 Cardinality: 1,736,755 15 NESTED LOOPS Cost: 3 Bytes: 792 Cardinality: 22 13 TABLE ACCESS BY INDEX ROWID TABLE MYUSER.SMALL_TABLE Cost: 2 Bytes: 27 Cardinality: 1 12 INDEX RANGE SCAN INDEX MYUSER.IDX_SMALL_TABLE_1ATUAL Cost: 1 Cardinality: 6 14 INDEX RANGE SCAN INDEX (UNIQUE) MYUSER.PK_MEDIUM_TABLE Cost: 1 Bytes: 477 Cardinality: 53 16 INDEX RANGE SCAN INDEX MYUSER.IDX_HUGE_TABLE_1 Cost: 18,849 Cardinality: 1,322,763 18 TABLE ACCESS BY INDEX ROWID TABLE MYUSER.HUGE_TABLE Cost: 413,818 Bytes: 19,620,720 Cardinality: 77,860 20 TABLE ACCESS FULL TABLE MYUSER.HUGE_TABLE Cost: 3,958,129 Bytes: 12,063,594,150 Cardinality: 402,119,805
好的一個。我在這裡所做的只是獲取
select my_id from small_table where .. something simple ..
(七條記錄)的結果,並my_id in ( 1, 2, 3, 4, 5, 6, 7)
在大查詢的末尾添加了一個。與描述相同:Plan SELECT STATEMENT ALL_ROWSCost: 4,558,125 18 HASH JOIN Cost: 4,558,125 Bytes: 36,100,625 Cardinality: 122,375 16 NESTED LOOPS 14 NESTED LOOPS Cost: 413,809 Bytes: 5,271,671 Cardinality: 122,597 12 VIEW VIEW VW_NSO_1 Cost: 20 Bytes: 13 Cardinality: 1 11 HASH UNIQUE Bytes: 91 Cardinality: 1 10 NESTED LOOPS 8 NESTED LOOPS Cost: 18 Bytes: 91 Cardinality: 1 6 NESTED LOOPS Cost: 4 Bytes: 70 Cardinality: 1 4 TABLE ACCESS BY INDEX ROWID TABLE MYUSER.SMALL_TABLE Cost: 1 Bytes: 35 Cardinality: 1 3 INDEX UNIQUE SCAN INDEX (UNIQUE) MYUSER.PK_SMALL_TABLE Cost: 0 Cardinality: 1 2 TABLE ACCESS BY INDEX ROWID TABLE MYUSER.SMALL_TABLE Cost: 2 Bytes: 27 Cardinality: 1 1 INDEX RANGE SCAN INDEX MYUSER.IDX_SMALL_TABLE_1ATUAL Cost: 1 Cardinality: 5 5 TABLE ACCESS FULL TABLE MYUSER.SMALL_TABLE Cost: 3 Bytes: 35 Cardinality: 1 7 INDEX RANGE SCAN INDEX (UNIQUE) MYUSER.PK_MEDIUM_TABLE Cost: 1 Cardinality: 53 9 TABLE ACCESS BY INDEX ROWID TABLE MYUSER.MEDIUM_TABLE Cost: 14 Bytes: 420 Cardinality: 20 13 INDEX RANGE SCAN INDEX MYUSER.IDX_HUGE_TABLE_1 Cost: 18,849 Cardinality: 1,322,763 15 TABLE ACCESS BY INDEX ROWID TABLE MYUSER.HUGE_TABLE Cost: 413,788 Bytes: 3,677,910 Cardinality: 122,597 17 TABLE ACCESS FULL TABLE MYUSER.HUGE_TABLE Cost: 3,962,804 Bytes: 3,655,562,652 Cardinality: 14,506,201
希望能幫助到你。
您的表很可能具有最新的統計資訊,但有時優化器會感到困惑,因為它過度簡化了基數估計。
這似乎是動態採樣的一個很好的候選。在其預設值(10g 和 11g 中的 2)中,僅當表之一沒有統計資訊時才會使用動態採樣。在您的情況下,您需要更改其值才能讓優化器收集統計資訊以建構更好的計劃。
我建議您使用
DYNAMIC_SAMPLING
可以讓您修改單個查詢的優化器行為的提示。我使用子查詢進行了測試,您需要使用以下語法之一:
- 直接在查詢頂部的完整提示,這將對所有表進行採樣,這肯定會起作用,但可能會花費太多時間。
SELECT /*+ DYNAMIC_SAMPLING (10) */ FROM ...
- 子查詢的完整提示:
SELECT ... a lot of stuff ... WHERE ... lots of complex clauses ... AND my_id in ( select /*+ DYNAMIC_SAMPLING (10) */ my_id from small_table where .. something simple ..)
- 帶有查詢塊名稱的提示:
SELECT /*+ DYNAMIC_SAMPLING (@my_block 10) */ ... a lot of stuff ... WHERE ... lots of complex clauses ... AND my_id in ( select /*+ QB_NAME(my_block) */ my_id from small_table where .. something simple ..)
第二個和第三個選項應該產生相同的結果:只在一張桌子上取樣。
好計劃和壞計劃之間的主要區別在於,壞計劃預計會返回大表中的大部分/所有數據,而好計劃預計會返回大約 5% 的數據。
硬編碼值可以產生這種差異的原因是優化器明確知道您正在尋找哪些值。當您加入另一個表時,它知道預期有多少行,但不一定知道這些行的值。
在下面的範例中,有一個小表,其
RW
值為 1-9 whereVAL = 'N'
和 aRW = 10
和VAL = 'Y'
。大錶鍊接到小表,但是絕大多數行的RW值都是10。如果我們想在大表中找到不常見的RW值(1-9),那麼我們可以連接到小表,其中這些行有VAL = 'N'
,或者只是明確地列出了數字 1-9。無論哪種情況,我們都希望從大表中返回 9 行。在連接的情況下,雖然優化器辨識出針對的查詢
SMALL_TAB
返回 9 行,但它無法準確確定這些如何連結到LARGE_TAB
. 因此,它對預期的 10,000 行進行全表掃描,而不是 9 行索引範圍掃描:create table small_tab as select rownum rw, decode(rownum, 10, 'Y', 'N') val from dual connect by level <= 10; create table large_tab as select case when rownum >= 10 then 10 else rownum end rw, --9990 rows have same value dbms_random.string('x', 20) filler from dual connect by level <= 10000; create index sm_i on small_tab(rw); create index lg_i on large_tab(rw); --calc stats, SIZE 10 to ensure we get histograms on the large table exec dbms_stats.gather_table_stats(user, 'small_tab', cascade => true); exec dbms_stats.gather_table_stats(user, 'large_tab', method_opt => 'FOR ALL COLUMNS SIZE 10', cascade => true); explain plan for SELECT * FROM large_tab l where l.rw in (select s.rw from small_tab s where val = 'N'); SELECT * FROM table(dbms_xplan.display(NULL, NULL, 'BASIC +ROWS')); -- the optimizer doesn't know how the SMALL_TAB.RW values returned by the -- subquery match back to LARGE_TAB.RW, so assumes we need to full scan the whole table -------------------------------------------------- | Id | Operation | Name | Rows | -------------------------------------------------- | 0 | SELECT STATEMENT | | 5000 | | 1 | HASH JOIN RIGHT SEMI| | 5000 | | 2 | TABLE ACCESS FULL | SMALL_TAB | 5 | | 3 | TABLE ACCESS FULL | LARGE_TAB | 10000 | -------------------------------------------------- explain plan for SELECT * FROM large_tab where rw IN (1, 2, 3, 4, 5, 6, 7, 8, 9); SELECT * FROM table(dbms_xplan.display(NULL, NULL, 'BASIC +ROWS')); --because we've explicity asked for values, these can be directly compared --against the stats information and the optimizer knows we should expect 9 rows. ---------------------------------------------------------- | Id | Operation | Name | Rows | ---------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | | 1 | INLIST ITERATOR | | | | 2 | TABLE ACCESS BY INDEX ROWID| LARGE_TAB | 9 | | 3 | INDEX RANGE SCAN | LG_I | 9 | ----------------------------------------------------------
為了幫助您解決此問題,如果查詢是靜態的(不會更改)並且您已獲得調整包的許可,我建議您查看 SQL 調整顧問。有了這個,您可以創建一個 SQL 配置文件,將查詢鎖定到一個好的計劃。希望顧問會自動為您找到好的計劃,但如果沒有,您可能必須手動創建它。一些連結:
使用優化顧問包:http ://www.oracle-base.com/articles/10g/automatic-sql-tuning-10g.php 手動創建 SQL 配置文件:http: //kerryosborne.oracle-guy.com/2010/ 07/sqlt-coe_xfr_sql_profilesql/
如果您沒有獲得調整包的許可,那麼我認為您只能重構查詢、添加索引和擺弄提示。找出哪些更改將受益將是一些反複試驗;如果不訪問實際數據集,很難準確地說出您應該做什麼。