Optimization
Does "consolidating" query conditions improve query performance?
Suppose we have a set of postal addresses A1, A2, A3, … of the form:
A1 = 124, "Smith", "Rd"
A2 = 142, "Jones", "St"
A3 = 8190, "Smith", "Ln"
A4 = 142, "Via Gracia", NULL
A5 = 190, "Jones", "St"
A6 = 31425, "Wedgewood", "Cir"
A7 = 810, "Via Gracia", NULL
...
where A_N is a vector of 3 components: StreetNumberN, StreetNameN, StreetSuffixN.
No two vectors are equal in every component, but many of them share StreetName and StreetSuffix.
Translating this data set directly into an SQL condition with a loop might yield:
WHERE (StreetNumber = 124 AND StreetName = "Smith" AND StreetSuffix = "Rd")
   OR (StreetNumber = 142 AND StreetName = "Jones" AND StreetSuffix = "St")
   OR (StreetNumber = 8190 AND StreetName = "Smith" AND StreetSuffix = "Ln")
   OR (StreetNumber = 142 AND StreetName = "Via Gracia")
   OR (StreetNumber = 190 AND StreetName = "Jones" AND StreetSuffix = "St")
   OR (StreetNumber = 31425 AND StreetName = "Wedgewood" AND StreetSuffix = "Cir")
   OR (StreetNumber = 810 AND StreetName = "Via Gracia")
Now, whenever two conditions match on StreetName and StreetSuffix, their StreetNumbers can logically be "consolidated":
WHERE (StreetNumber = 124 AND StreetName = "Smith" AND StreetSuffix = "Rd")
   OR (StreetNumber IN (142, 190) AND (StreetName = "Jones" AND StreetSuffix = "St"))
   OR (StreetNumber = 8190 AND StreetName = "Smith" AND StreetSuffix = "Ln")
   OR (StreetNumber IN (142, 810) AND StreetName = "Via Gracia")
   OR (StreetNumber = 31425 AND StreetName = "Wedgewood" AND StreetSuffix = "Cir")
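Keep in mind that this is only a toy example to show the "consolidation" principle I am talking about. In practice there would be many such consolidations to perform.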
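For readers who want to try the consolidation on a larger data set, here is a minimal Python sketch of the grouping step. It is not from the original post: the function name consolidated_where is hypothetical, the %s placeholders assume a MySQL driver that uses that parameter style, and, like the hand-written query above, it simply omits the StreetSuffix test when the suffix is NULL.

```python
from collections import defaultdict

# Sample addresses from the question: (StreetNumber, StreetName, StreetSuffix).
ADDRESSES = [
    (124, "Smith", "Rd"),
    (142, "Jones", "St"),
    (8190, "Smith", "Ln"),
    (142, "Via Gracia", None),
    (190, "Jones", "St"),
    (31425, "Wedgewood", "Cir"),
    (810, "Via Gracia", None),
]


def consolidated_where(addresses):
    """Build a parameterized WHERE clause (hypothetical helper).

    Street numbers that share the same name and suffix are merged
    into a single IN (...) list.
    """
    groups = defaultdict(list)  # (name, suffix) -> [numbers], in first-seen order
    for number, name, suffix in addresses:
        groups[(name, suffix)].append(number)

    branches, params = [], []
    for (name, suffix), numbers in groups.items():
        marks = ", ".join(["%s"] * len(numbers))
        branch = f"StreetNumber IN ({marks}) AND StreetName = %s"
        params.extend(numbers)
        params.append(name)
        if suffix is not None:
            # Mirrors the question: no suffix test at all when the suffix is NULL.
            branch += " AND StreetSuffix = %s"
            params.append(suffix)
        branches.append(f"({branch})")
    return "WHERE " + "\n   OR ".join(branches), params


clause, params = consolidated_where(ADDRESSES)
print(clause)
print(params)
```

The clause and the parameter list are kept separate so the values can be passed to the driver rather than spliced into the SQL text.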
Is there, or should there be, any performance advantage to consolidating the StreetNumbers as described above?
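Fair question. Let me cut to the chase by telling you the relative cost of the various components of the query.
OR is a de-optimizer. Either of these formulations will work this way:
- Scan the entire table.
- For each row, check the WHERE clause.
- Most or all of the OR branches are evaluated before a given row can be rejected, so your proposal is a slight improvement.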
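The steps above can be pictured with a toy cost model in Python. This is only an illustration under stated assumptions, not how MySQL is implemented: the table contents, the scan function and the counters are all hypothetical. It counts how many rows are fetched and how many OR branches are checked for the plain and the consolidated form.

```python
# Toy model of a full table scan; every name here is hypothetical.
# Hypothetical table: 1000 rows, all on "Jones St", numbered 1..1000.
ROWS = [(number, "Jones", "St") for number in range(1, 1001)]

# Each branch is (set of street numbers, street name, street suffix).
NAIVE = [({142}, "Jones", "St"), ({190}, "Jones", "St")]  # one OR branch per address
MERGED = [({142, 190}, "Jones", "St")]                    # numbers merged into IN (...)


def scan(rows, branches):
    """Visit every row and test OR branches until one matches."""
    fetched = branch_checks = matched = 0
    for number, name, suffix in rows:
        fetched += 1
        for numbers, b_name, b_suffix in branches:
            branch_checks += 1
            if number in numbers and name == b_name and suffix == b_suffix:
                matched += 1
                break  # a row is accepted as soon as one branch matches
    return fetched, branch_checks, matched


naive_stats = scan(ROWS, NAIVE)
merged_stats = scan(ROWS, MERGED)
print("naive :", naive_stats)
print("merged:", merged_stats)
```

In this model the consolidated form roughly halves the number of branch checks, but both forms still fetch every row, which is the point the list above is making.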
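Scanning the table and fetching each row is the most expensive part. Evaluating the WHERE clause is a minor part. That is, in the grand scheme of things, whatever savings you get will be negligible. Focusing on getting rid of the OR is likely to pay off much more. If the table has a million rows, this formulation will run faster: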
SELECT .. FROM .. WHERE (StreetNumber = 124 AND StreetName = "Smith" AND StreetSuffix = "Rd")
UNION ALL
SELECT .. FROM .. WHERE (StreetNumber = 142 AND StreetName = "Jones" AND StreetSuffix = "St")
UNION ALL
SELECT .. FROM .. WHERE (StreetNumber = 8190 AND StreetName = "Smith" AND StreetSuffix = "Ln")
...
But there is a catch. You will need an index:
INDEX(StreetNumber, StreetName, StreetSuffix) -- (in any order)
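As a rough way to experiment with the UNION ALL form, the following Python sketch builds both queries from a list of addresses and checks that they return the same rows. It uses an in-memory SQLite database purely as a stand-in: the table name addresses, the index name idx_addr and the sample rows are assumptions, and SQLite's planner is not MySQL's, so this shows only that the two formulations are equivalent in results, not how fast either runs.

```python
import sqlite3

# Hypothetical table, index and data; SQLite stands in for the real server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE addresses (StreetNumber INTEGER, StreetName TEXT, StreetSuffix TEXT)"
)
conn.execute(
    "CREATE INDEX idx_addr ON addresses (StreetNumber, StreetName, StreetSuffix)"
)
conn.executemany(
    "INSERT INTO addresses VALUES (?, ?, ?)",
    [
        (124, "Smith", "Rd"),
        (142, "Jones", "St"),
        (8190, "Smith", "Ln"),
        (999, "Elm", "St"),    # not requested
        (142, "Smith", "Rd"),  # shares parts with requested rows, but not requested
    ],
)

wanted = [(124, "Smith", "Rd"), (142, "Jones", "St"), (8190, "Smith", "Ln")]

branch = "StreetNumber = ? AND StreetName = ? AND StreetSuffix = ?"
params = [value for address in wanted for value in address]

# One OR branch per address.
or_sql = "SELECT * FROM addresses WHERE " + " OR ".join(f"({branch})" for _ in wanted)

# One SELECT per address, glued together with UNION ALL.
union_sql = " UNION ALL ".join(
    f"SELECT * FROM addresses WHERE {branch}" for _ in wanted
)

or_rows = sorted(conn.execute(or_sql, params).fetchall())
union_rows = sorted(conn.execute(union_sql, params).fetchall())
print(or_rows == union_rows)
```

Each SELECT in the UNION ALL form is a plain equality lookup on all three indexed columns, which is the shape the index above is meant to serve.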
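If there could be duplicates, use UNION DISTINCT instead of UNION ALL.
For up to a few hundred rows, simply write whatever is simplest for you (for example, the OR form with no optimization).
You will soon discover that your address model is incomplete. What about "Apt. 123" or "c/o …"? And so on. There may be no good reason to split an address into multiple columns; simply store a single string with all the parts written in a way that is convenient for display. Eliminating the ANDs that way would be a tiny optimization.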