Query-Performance

WITH-clause 沒有預先計算,線性資源浪費,為什麼?

  • December 14, 2020

我正在嘗試節省資源並縮短程式碼,但使用“WITH-clause”會浪費線性資源。

這是我的基本範例:

#!/bin/bash

echo "CREATE TABLE 'clients_table' ( id integer primary key, \
                       name            text DEFAULT 'noname', \
                       param1          int  DEFAULT 0, \
                       param2          int  DEFAULT 0, \
                       param3          int  DEFAULT 0, \
                       param4          int  DEFAULT 0, \
                       param5          int  DEFAULT 0 \
                       );"


# dimension=100
# dimension=20
dimension=15    # 1m16s
# dimension=10  # 16 seconds.
# dimension=2
for i1 in `seq 1 $dimension`;do
       echo "$(date): i1: $i1" > /dev/stderr
       for i2 in `seq 1 $dimension`;do
               for i3 in `seq 1 $dimension`;do
                       for i4 in `seq 1 $dimension`;do
                               for i5 in `seq 1 $dimension`;do
                                       echo "INSERT INTO clients_table ( name, param1, param2, param3, param4, param5 ) VALUES ( \
                                               'c${i1}_${i2}_${i3}_${i4}_${i5}', $i1, $i2, $i3, $i4, $i5 );"
                               done
                       done
               done
       done
done

在 1 分 16 秒內,維度為 15,我通過以下命令獲取 sqlite 數據庫:

# time bash gen_base.sh | sqlite3 ram_fs/tmp.db
...
real    1m16.769s
user    1m23.690s
sys     0m53.379s
# du -hs ram_fs/tmp.db 
23M     ram_fs/tmp.db

此後,我將這樣的 sql 用於不同的客戶端,通過每個參數,一點點來自這個,一點點來自那個……

WITH work_set AS (
   SELECT * FROM clients_table
       WHERE   param1 % 2 == 0
       ORDER BY    param1 ASC,
               param2 DESC,
               param3 ASC,
               param4 DESC,
               param5 ASC
),
odd2        AS ( SELECT * FROM work_set WHERE param2 % 2 == 1 ),
even2       AS ( SELECT * FROM work_set WHERE param2 % 2 == 0 ),
/* --------------------------- */
odd2odd3    AS ( SELECT * FROM odd2     WHERE param3 % 2 == 1 ),
odd2even3   AS ( SELECT * FROM odd2     WHERE param3 % 2 == 0 ),
even2odd3   AS ( SELECT * FROM even2    WHERE param3 % 2 == 1 ),
even2even3  AS ( SELECT * FROM even2    WHERE param3 % 2 == 0 ),
/*-----------------------------*/
odd2odd3odd4    AS ( SELECT * FROM odd2odd3 WHERE param4 % 2 == 1 ),
odd2odd3even4   AS ( SELECT * FROM odd2odd3 WHERE param4 % 2 == 0 ),
odd2even3odd4   AS ( SELECT * FROM odd2even3    WHERE param4 % 2 == 1 ),
odd2even3even4  AS ( SELECT * FROM odd2even3    WHERE param4 % 2 == 0 ),
/*--*/
even2odd3odd4   AS ( SELECT * FROM even2odd3    WHERE param4 % 2 == 1 ),
even2odd3even4  AS ( SELECT * FROM even2odd3    WHERE param4 % 2 == 0 ),
even2even3odd4  AS ( SELECT * FROM even2even3   WHERE param4 % 2 == 1 ),
even2even3even4 AS ( SELECT * FROM even2even3   WHERE param4 % 2 == 0 ),
/* --------------------------- */
c1      AS ( SELECT * FROM odd2odd3odd4 WHERE param5 % 2 == 1 LIMIT 1 ),
c2      AS ( SELECT * FROM odd2odd3odd4 WHERE param5 % 2 == 0 LIMIT 1 ),
/*--*/
c3      AS ( SELECT * FROM odd2odd3even4 WHERE param5 % 2 == 1 LIMIT 1 ),
c4      AS ( SELECT * FROM odd2odd3even4 WHERE param5 % 2 == 0 LIMIT 1 ),
/*--*/
c5      AS ( SELECT * FROM odd2even3odd4 WHERE param5 % 2 == 1 LIMIT 1 ),
c6      AS ( SELECT * FROM odd2even3odd4 WHERE param5 % 2 == 0 LIMIT 1 ),
/*--*/
c7      AS ( SELECT * FROM odd2even3even4 WHERE param5 % 2 == 1 LIMIT 1 ),
c8      AS ( SELECT * FROM odd2even3even4 WHERE param5 % 2 == 0 LIMIT 1 ),
/*--*/
c9      AS ( SELECT * FROM even2odd3odd4 WHERE param5 % 2 == 1 LIMIT 1 ),
c10     AS ( SELECT * FROM even2odd3odd4 WHERE param5 % 2 == 0 LIMIT 1 ),
/*--*/
c11     AS ( SELECT * FROM even2odd3even4 WHERE param5 % 2 == 1 LIMIT 1 ),
c12     AS ( SELECT * FROM even2odd3even4 WHERE param5 % 2 == 0 LIMIT 1 ),
/*--*/
c13     AS ( SELECT * FROM even2even3odd4 WHERE param5 % 2 == 1 LIMIT 1 ),
c14     AS ( SELECT * FROM even2even3odd4 WHERE param5 % 2 == 0 LIMIT 1 ),
/*--*/
c15     AS ( SELECT * FROM even2even3even4 WHERE param5 % 2 == 1 LIMIT 1 ),
c16     AS ( SELECT * FROM even2even3even4 WHERE param5 % 2 == 0 LIMIT 1 )
/* --------------------------- */
SELECT * FROM c1 UNION ALL
SELECT * FROM c2 UNION ALL
SELECT * FROM c3 UNION ALL
SELECT * FROM c4 UNION ALL
SELECT * FROM c5 UNION ALL
SELECT * FROM c6 UNION ALL
SELECT * FROM c7 UNION ALL
SELECT * FROM c8 UNION ALL
SELECT * FROM c9 /*UNION ALL
SELECT * FROM c10 UNION ALL
SELECT * FROM c11 UNION ALL
SELECT * FROM c12 UNION ALL
SELECT * FROM c13 UNION ALL
SELECT * FROM c14 UNION ALL
SELECT * FROM c15 UNION ALL
SELECT * FROM c16
*/

我正在嘗試/*在 sql 末尾移動註釋的開頭,以區分 UNION 的數量並測量時間:

# time cat with_select_bug.sql  | sqlite3 ram_fs/tmp.db 
  • 0m0.092s - 只有一個客戶端。
  • 0m0.183s - 兩個客戶端。
  • 0m0.274s - 三個客戶端。
  • 0m0.678s - 八個客戶端。
  • 0m1.336s - 所有十六個客戶端。

正如您所看到的,CPU 時間使用率線性增長,一個客戶端 - 比兩個客戶端快兩倍,所有客戶端 - 想要從我的 CPU 中佔用整整一秒。為什麼?

我已經聲明了WITH-clause,對吧?-SELECT應預先計算,所有客戶端應僅過濾先前計算數據的結果。


此外,如果我多次詢問同一個客戶:

...
SELECT * FROM c16 UNION ALL
SELECT * FROM c16 UNION ALL
SELECT * FROM c16 UNION ALL
SELECT * FROM c16 UNION ALL
...

每次選擇都會給我同一個客戶,但是每個人都會分開計算!!像c16- 的聲明一樣不存在……

我可以放 100 行SELECT * FROM c16 UNION ALL,它會比 1 慢 100 倍。為什麼這樣?

問題:

  • 為什麼增長是線性的?
  • 我怎樣才能提高速度?
  • **是“WITH-clause” - 只是為了縮短程式碼?**像宏一樣,沒有加速功能……

您所指的“WITH 子句”是CTE。它允許您多次重用程式碼但只編寫一次。

您的程式碼有一個我從未見過表現良好的模式 - 多個嵌套的 CTE。您編寫的每個 CTE 都會執行多次 - 每次呼叫它時執行一次。如果您發布您的執行計劃,它應該確認這一點。

這篇文章比我更雄辯地解釋了,多次呼叫同一個 CTE 將如何導致優化器在幕後多次執行 CTE。

是“WITH-clause” - 只是為了縮短程式碼?像宏一樣,沒有加速功能……

我會說 CTE 是一種可讀性工具,如果你願意的話,它是一種語法糖。它不僅沒有任何“加速功能”,而且實際上具有與上述文章相反的效果。

我怎樣才能提高速度?

很難從程式碼片段中判斷您的程式碼試圖實現什麼,但可能值得研究CROSS APPLY文章中的解決方案,或者考慮使用只會執行一次相關讀取的臨時表,而不是 CTE。

您可以通過在程式碼開頭添加來驗證您所做的更改是否使事情變得更好SET STATISTICS IO, TIME ON- 如果在您進行程式碼更改後您的讀取和 CPU 時間減少,您的程式碼性能更高。

引用自:https://dba.stackexchange.com/questions/281519