無法進一步調整數據庫;接下來是什麼?
我們有一個供應商提供的應用程序。它正在得到支持,我們正在與他們的開發人員交談(我們擁有他們最大的數據庫大小一個數量級),但與此同時,我們有以下查詢,每天執行數千次,唯一的部分改變了
ABCDEFG12345
在where
條款中。where’s 中的幾乎所有內容都是可定制的,因為它是由他們的主要工作搜尋生成的查詢。dJobs 表中大約有 500,000 條記錄,連接表中的記錄數量相似:
select * from ( select top 20 * from ( select top 20 ( --Start Count select count(*) from dJobs left join dClients on cltClientID = jobClientId left join dJobStatus ON jbsID = jobJobStatus where ((jobSupervisor= -1.00000000 or jobCoordinator= -1.00000000 ) OR 1=1 ) and ( (jobjobStatus=0 OR (0=0 AND 0=0)) AND (jbsType=0 OR (0=0 AND 0=0)) AND (jobPriority=0 OR (0=0 AND 0=0)) AND (jobEventID=0 OR (0=0 AND 0=0)) AND (jobSiteCode='' OR (''='' AND ''='')) AND (jobjobStatus IN ( select jbsid from djobstatusgroupmapping where jsgid = 0 ) OR (0=0 AND 0=0)) AND jobDateCreated BETWEEN '2004-06-10' AND '2014-06-10' AND jobBookedDate BETWEEN '1901-01-01 00:00:00' AND '2024-06-10 23:59:00' and (('ABCDEFG12345'<>'' AND ( (1 = 1 AND ( jobWorkOrderNo like '%ABCDEFG12345%' OR jobSiteName like '%ABCDEFG12345%' OR jobSiteLocationBuilding like '%ABCDEFG12345%' OR jobSiteAddress like '%ABCDEFG12345%' OR jobSiteSuburb like '%ABCDEFG12345%' OR jobSitePostcode like '%ABCDEFG12345%' OR jobWorkToDo like '%ABCDEFG12345%' OR jobSiteClient like '%ABCDEFG12345%' OR jobSiteContact like '%ABCDEFG12345%' OR jobSiteContactPhone like '%ABCDEFG12345%' OR jobSiteContactPhone2 like '%ABCDEFG12345%' ) ) OR (1 = 1 AND ( cltClientName like '%ABCDEFG12345%' OR cltDivision like '%ABCDEFG12345%' OR cltAddress1 like '%ABCDEFG12345%' OR cltAddress2 like '%ABCDEFG12345%' OR rtrim(cltAddress1)+' '+rtrim(cltAddress2) like '%ABCDEFG12345%' OR cltSuburb like '%ABCDEFG12345%' OR cltPostCode like '%ABCDEFG12345%' OR cltState like '%ABCDEFG12345%' OR cltTelephone like '%ABCDEFG12345%' OR cltContact like '%ABCDEFG12345%' OR cltMobile like '%ABCDEFG12345%' OR cltEmail like '%ABCDEFG12345%' ) ) ) ) OR 'ABCDEFG12345'='') ) --End Count ) as vRecCount, jobID, jobWorkOrderNo, jobSequence, jobSiteClient, jobSiteStreetNumber, jobSiteAddress, jobSiteSuburb, jobSiteState, jobSitePostcode, jobDateofLoss, jobDateofLossTime, jobDateCreated, jobTargetDate, jobClientID, jobPriority, jobJobStatus, jobEventID, regFullname, case when (not jobSupervisor= -1.00000000 ) and jobCoordinator= -1.00000000 then 1 else case when (jobSupervisor= -1.00000000 ) and (jobCoordinator=0) then 2 else case when (jobSupervisor= -1.00000000 ) and (not jobCoordinator= -1.00000000 ) then 3 else 0 end end end as 'Coordinator', jobSupervisor, jobBookedDate, jobLat, jobLong, jobClaimAssistRequest, jobBooked, jobSiteContact, jobSiteContactPhone2, jobPCM from dJobs left join dClients on cltClientID = jobClientId left join dUsers on regUserId = jobCoordinator left join dJobStatus ON jbsID = jobJobStatus where ((jobSupervisor= -1.00000000 or jobCoordinator= -1.00000000 ) OR 1=1 ) and ( (jobjobStatus=0 OR (0=0 AND 0=0)) AND (jobPriority=0 OR (0=0 AND 0=0)) AND (jbsType=0 OR (0=0 AND 0=0)) AND (jobEventID=0 OR (0=0 AND 0=0)) AND (jobSiteCode='' OR (''='' AND ''='')) AND (jobjobStatus IN ( select jbsid from djobstatusgroupmapping where jsgid = 0 ) OR (0=0 AND 0=0)) ) AND jobDateCreated BETWEEN '2004-06-10' AND '2014-06-10' AND jobBookedDate BETWEEN '1901-01-01 00:00:00' AND '2024-06-10 23:59:00' and (('ABCDEFG12345'<>'' AND ( (1 = 1 AND ( jobWorkOrderNo like '%ABCDEFG12345%' OR jobSiteName like '%ABCDEFG12345%' OR jobSiteLocationBuilding like '%ABCDEFG12345%' OR jobSiteAddress like '%ABCDEFG12345%' OR rtrim(jobSiteStreetNumber)+' '+rtrim(jobSiteAddress) LIKE '%ABCDEFG12345%' OR jobSiteSuburb like '%ABCDEFG12345%' OR jobSitePostcode like '%ABCDEFG12345%' OR jobWorkToDo like '%ABCDEFG12345%' OR jobSiteClient like '%ABCDEFG12345%' OR jobSiteContact like '%ABCDEFG12345%' OR jobSiteContactPhone like '%ABCDEFG12345%' OR jobSiteContactPhone2 like '%ABCDEFG12345%' ) ) OR (1 = 1 AND ( cltClientName like '%ABCDEFG12345%' OR cltDivision like '%ABCDEFG12345%' OR cltAddress1 like '%ABCDEFG12345%' OR cltAddress2 like '%ABCDEFG12345%' OR rtrim(cltAddress1)+' '+rtrim(cltAddress2) like '%ABCDEFG12345%' OR cltSuburb like '%ABCDEFG12345%' OR cltPostCode like '%ABCDEFG12345%' OR cltState like '%ABCDEFG12345%' OR cltTelephone like '%ABCDEFG12345%' OR cltContact like '%ABCDEFG12345%' OR cltMobile like '%ABCDEFG12345%' OR cltEmail like '%ABCDEFG12345%' ) ) ) ) OR 'ABCDEFG12345'='') order by jobid desc , jobDateCreated desc ) as newtbl order by jobid asc , jobDateCreated asc ) as newtbl_2 order by jobid desc, jobDateCreated desc
SQL Server 數據庫優化顧問對我沒有任何建議 - 所以看起來開發人員在為查詢創建適當的索引和統計數據方面做得很好。
直到他們最終可以重構東西,這可能需要 6 個月以上,我們還需要做什麼?只是在 SQL 伺服器上投入更多資金(SSD、更多 RAM,也許企業版用於分區)?
交易成本細分為:
- 總 CPU 時間(毫秒): 12,237.70
- # 總邏輯 IO: 2,331,089
- # 平均邏輯 IO: 2,331,089.00
- # 邏輯讀取: 2,331,089
在 VMWare 虛擬機中執行的 SQL Server 2008 R2 Standard。
下面執行計劃的有趣部分。大部分時間花在聚集索引掃描上。
抱歉耽擱了; 以下統計數據來自我們的暫存伺服器,該伺服器的記錄數量是生產記錄數的 1/10,但也在較低級別的硬體上。執行計劃 XML ,
STATISTICS IO
:Table 'dClients'. Scan count 5, logical reads 705780, physical reads 9, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'dJobs'. Scan count 10, logical reads 23386, physical reads 16, read-ahead reads 10601, lob logical reads 186928, lob physical reads 578, lob read-ahead reads 0. Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'dUsers'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
如果不重構以使用全文函式(CONTAINS、FREETEXT 或它們的表等價物),全文將無濟於事。它也不適用於前導萬用字元。黑客是可用的,但基本上你將很難為全文編寫一個語義等效的查詢。對於未來,考慮重新設計具有詞幹(run、runner、running)和同義詞庫(jogger)的全文,這可以比兩個萬用字元更好地為您的搜尋提供服務。
除非您受記憶體限制,否則 SSD 不太可能幫助您。您的表(只有 50 萬條記錄)可能大部分時間都在記憶體中。你能確認 dJobs 表的大小和伺服器 RAM 嗎?
企業版可以在 64GB RAM/4 個插槽或 16 個核心的限制達到 8 個的情況下提供幫助,但您將需要一個非常強大的盒子來發現差異。例如,4 確實意味著您可以擁有 4 個四核處理器,總共 16 個核心,啟用 HT 後,您已經擁有 32 個邏輯處理器。對於這種類型的 OLTP 機器,一般推薦的伺服器 maxdop 無論如何都是 8。我認為這不太可能受益,因為您的查詢有更多基本問題,但您永遠不知道。
非聚集索引(特別是在 dJobs 上)不太可能有幫助,因為查詢在 SELECT 中具有來自該表的如此多的列,在 WHERE 子句中具有許多條件。非聚集索引必須如此廣泛才能覆蓋它實際上是聚集索引的副本,因此維護成本過高。由於查詢按 jobID DESC 排序,我考慮了一個降序索引,但沒有嘗試過。
分區,(僅限企業版)確實是一個很棒的功能,但同樣不太可能幫助您。我對 dbo.dJobs.jobJobStatus 列上的分區進行了快速調查,例如,我想您在任何時候都只有一小部分“活動”作業,例如,500,000 條記錄中的幾百甚至幾千。OR OR OR 方法可能會取消分區消除。多個分區的並行掃描也是企業功能:
這可能會起作用:
SELECT TOP 20 * FROM dJobs LEFT JOIN dClients on cltClientID = jobClientId LEFT JOIN dUsers on regUserId = jobCoordinator LEFT JOIN dJobStatus ON jbsID = jobJobStatus WHERE ( jobjobStatus IN ( SELECT jbsid FROM djobstatusgroupmapping WHERE jsgid = 0 ) ) ORDER BY jobID DESC
這可能行不通:
SELECT TOP 20 * FROM dJobs LEFT JOIN dClients on cltClientID = jobClientId LEFT JOIN dUsers on regUserId = jobCoordinator LEFT JOIN dJobStatus ON jbsID = jobJobStatus WHERE ( jobjobStatus IN ( SELECT jbsid FROM djobstatusgroupmapping WHERE jsgid = 0 ) OR ( 0=0 ) OR ( 0=0 ) --<< this 'OR always true' means 'get the whole table' ) ORDER BY jobID DESC
這導致我進入查詢。OR OR OR 方法基本上意味著“總是得到整張桌子”。TOP 20 掩蓋了這個設計問題。TOP 也可能將計劃推向了 Jon 認為可疑的嵌套循環。對於這個噩夢般的“掃描所有列”構造的查詢,我還印象深刻的是,您基本上有兩個相同查詢的副本(因此還有表),一個用於計數,一個用於結果集。例如,如果數據進入中間表並從那裡完成計數,這可能會更有效。
最後,這讓我想到了唯一能真正幫助您的事情(無需大規模重構程式碼):數據刪除或存檔。如前所述,我想您在任何時候都只有一小部分“活躍”的工作。將“非活動”的切掉到不同的表中。在兩個表的頂部創建一個視圖以進行報告。設置一個夜間作業來複製舊記錄。
主表中只有幾千個活動作業很可能會改變您的查詢性能。
一些推薦閱讀:
Erland Sommarskog 關於這些“搜尋所有列”的文章 在 T-SQL 中查詢動態搜尋條件 http://www.sommarskog.se/dyn-search-2008.html
查詢多列(全文搜尋) http://technet.microsoft.com/en-us/library/ms142488(v=sql.105).aspx
我希望這會有所幫助!
我認為您仍然應該能夠降低讀數;您是否嘗試過將聚集索引掃描轉換為
dJobs
表上的非聚集索引掃描?很難從圖片上準確地說出你需要如何去做(至少),並且在整體性能方面它可能會以任何一種方式進行,但如果你不能或做並且它變得更糟,分區jobBookedDate
或jobDateCreated
更快的 IO 子系統看起來將是您唯一的選擇。