Optimization

Does "consolidating" query conditions improve query performance?

  • June 10, 2019

Suppose we have a set of postal addresses A1, A2, A3, … of the form:

A1 = 124, "Smith", "Rd"
A2 = 142, "Jones", "St"
A3 = 8190, "Smith", "Ln"
A4 = 142, "Via Gracia", NULL
A5 = 190, "Jones", "St"
A6 = 31425, "Wedgewood", "Cir"
A7 = 810, "Via Gracia", NULL
...

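where A_N is a vector of 3 components: StreetNumberN, StreetNameN, StreetSuffixN.

No two vectors are equal in every component, but many vectors share the same StreetName and StreetSuffix.

Translating this data set directly into a SQL condition with a loop, one might obtain: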

WHERE
   (StreetNumber = 124 AND StreetName = "Smith" AND StreetSuffix = "Rd")
OR
   (StreetNumber = 142 AND StreetName = "Jones" AND StreetSuffix = "St")
OR
   (StreetNumber = 8190 AND StreetName = "Smith" AND StreetSuffix = "Ln")
OR
   (StreetNumber = 142 AND StreetName = "Via Gracia")
OR
   (StreetNumber = 190 AND StreetName = "Jones" AND StreetSuffix = "St")
OR
   (StreetNumber = 31425 AND StreetName = "Wedgewood" AND StreetSuffix = "Cir")
OR
   (StreetNumber = 810 AND StreetName = "Via Gracia")

Now, whenever two of these conditions have the same StreetName and StreetSuffix, their StreetNumbers can logically be "consolidated":

WHERE
   (StreetNumber = 124 AND StreetName = "Smith" AND StreetSuffix = "Rd")
OR
   (StreetNumber IN(142, 190) AND (StreetName = "Jones" AND StreetSuffix = "St"))
OR
   (StreetNumber = 8190 AND StreetName = "Smith" AND StreetSuffix = "Ln")
OR
   (StreetNumber IN (142,810) AND StreetName = "Via Gracia")
OR
   (StreetNumber = 31425 AND StreetName = "Wedgewood" AND StreetSuffix = "Cir")

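Keep in mind that this is only a toy example to illustrate what I mean by "consolidation". In practice, there would be many such consolidations to perform.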
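As a rough illustration of the consolidation step, here is a minimal Python sketch that groups the toy addresses by (StreetName, StreetSuffix) and emits one IN list per group. The helper name `consolidated_where` and the `?` placeholder style (as used by Python's sqlite3 module) are assumptions made for this example, not part of the original question. It mirrors the example above by leaving out the StreetSuffix test when the suffix is NULL.

```python
from collections import defaultdict

# Toy data from the example above: (StreetNumber, StreetName, StreetSuffix).
ADDRESSES = [
    (124, "Smith", "Rd"),
    (142, "Jones", "St"),
    (8190, "Smith", "Ln"),
    (142, "Via Gracia", None),
    (190, "Jones", "St"),
    (31425, "Wedgewood", "Cir"),
    (810, "Via Gracia", None),
]


def consolidated_where(addresses):
    """Hypothetical helper: build a parameterized WHERE clause with one
    IN list of street numbers per (StreetName, StreetSuffix) pair."""
    groups = defaultdict(list)
    for number, name, suffix in addresses:
        groups[(name, suffix)].append(number)

    clauses, params = [], []
    for (name, suffix), numbers in groups.items():
        placeholders = ", ".join("?" for _ in numbers)
        clause = f"(StreetNumber IN ({placeholders}) AND StreetName = ?"
        params.extend(numbers)
        params.append(name)
        if suffix is not None:
            clause += " AND StreetSuffix = ?"
            params.append(suffix)
        # As in the example above, a NULL suffix adds no condition here;
        # "AND StreetSuffix IS NULL" would be the stricter choice.
        clauses.append(clause + ")")
    return "WHERE " + " OR ".join(clauses), params


where_sql, where_params = consolidated_where(ADDRESSES)
print(where_sql)
print(where_params)
```

Passing the values as parameters rather than pasting them into the SQL text also avoids quoting problems with street names.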

Does, or should, consolidating the StreetNumbers as shown above give any performance benefit?

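A reasonable question. Let me give you the relative cost of the various components of a query.

OR is a de-optimizer. Either of those formulations will be executed this way:

  • Scan the entire table
  • For each row, check the WHERE clause
  • Most or all of the OR parts will be evaluated before a given row can be eliminated, so your suggestion gives a slight improvement.

Scanning the table and fetching each row is the most expensive part. Evaluating the WHERE clause is a minor part.

That is, in the grand scheme of things, any savings you gain will be insignificant. Focusing on getting rid of the OR is likely to provide more benefit.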
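One way to see that both formulations end up as the same kind of scan is to ask the database for its plan. The sketch below uses Python's built-in sqlite3 module with a hypothetical, unindexed `addresses` table. SQLite is only a convenient stand-in here; other engines have different optimizers and different EXPLAIN output, so treat it as an illustration rather than a measurement.

```python
import sqlite3

# Hypothetical table with no indexes at all.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE addresses "
    "(StreetNumber INTEGER, StreetName TEXT, StreetSuffix TEXT)"
)

separate = """
    SELECT * FROM addresses
    WHERE (StreetNumber = 142 AND StreetName = 'Jones' AND StreetSuffix = 'St')
       OR (StreetNumber = 190 AND StreetName = 'Jones' AND StreetSuffix = 'St')
"""
consolidated = """
    SELECT * FROM addresses
    WHERE StreetNumber IN (142, 190)
      AND StreetName = 'Jones' AND StreetSuffix = 'St'
"""


def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row describes one plan step.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


plan_separate = plan(separate)
plan_consolidated = plan(consolidated)
print(plan_separate)
print(plan_consolidated)
```

With no usable index, both plans should report a scan of the table, which is the expensive part described above; consolidation only changes how much work the WHERE check does per row.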

If there are a million rows in the table, then this formulation will run faster:

SELECT .. FROM .. WHERE (StreetNumber = 124 AND StreetName = "Smith" AND StreetSuffix = "Rd")
UNION ALL
SELECT .. FROM .. WHERE (StreetNumber = 142 AND StreetName = "Jones" AND StreetSuffix = "St")
UNION ALL
SELECT .. FROM .. WHERE (StreetNumber = 8190 AND StreetName = "Smith" AND StreetSuffix = "Ln")
...

But with this proviso: you will need an index:

INDEX(StreetNumber, StreetName, StreetSuffix)  -- (in any order)

If there could be duplicates, use UNION DISTINCT instead of UNION ALL.
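The following sketch puts the index and the UNION variants together, again using Python's sqlite3 module as a stand-in engine. The table name `addresses` and the index name `idx_address` are assumptions made for the example. The distinct union is written here as plain `UNION`, which has the same meaning as `UNION DISTINCT` in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE addresses "
    "(StreetNumber INTEGER, StreetName TEXT, StreetSuffix TEXT)"
)
# Composite index covering all three columns tested by each branch.
conn.execute(
    "CREATE INDEX idx_address "
    "ON addresses (StreetNumber, StreetName, StreetSuffix)"
)
conn.executemany(
    "INSERT INTO addresses VALUES (?, ?, ?)",
    [(124, "Smith", "Rd"), (142, "Jones", "St"), (8190, "Smith", "Ln")],
)

# One branch of the union; every branch has the same shape.
branch = (
    "SELECT StreetNumber, StreetName, StreetSuffix FROM addresses "
    "WHERE StreetNumber = ? AND StreetName = ? AND StreetSuffix = ?"
)
smith = (124, "Smith", "Rd")
jones = (142, "Jones", "St")

# Different branches: UNION ALL concatenates the per-branch results.
rows_all = conn.execute(
    f"{branch} UNION ALL {branch}", smith + jones
).fetchall()

# The same branch twice: UNION ALL keeps the duplicate, UNION removes it.
dup_all = conn.execute(
    f"{branch} UNION ALL {branch}", smith + smith
).fetchall()
dup_distinct = conn.execute(
    f"{branch} UNION {branch}", smith + smith
).fetchall()

# Plan steps for the two-branch query; each branch should use the index.
plan = [
    row[-1]
    for row in conn.execute(
        f"EXPLAIN QUERY PLAN {branch} UNION ALL {branch}", smith + jones
    )
]
print(rows_all, len(dup_all), len(dup_distinct))
print(plan)
```

Each branch is a plain equality lookup on all three columns, which is the shape a composite index can satisfy directly.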

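For up to a few hundred rows, just write whatever is simplest for you (for example, the OR version with no optimization).

You will soon discover that your address model is incomplete. What about "Apt. 123" or "c/o …"? And so on. There may be no good reason to split an address into multiple columns; just use a single string with all the parts written in a way that is convenient for display. Eliminating the ANDs that way would be a tiny optimization.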

Quoted from: https://dba.stackexchange.com/questions/240205