Sql-Server

改進 DbGeography 查詢

  • December 22, 2015

我還是數據庫管理的新手,我正在嘗試優化搜尋查詢。

我有一個看起來像這樣的查詢,在某些情況下執行需要 5-15 秒,並且還導致 100% 的 CPU 使用率:

DECLARE @point geography;
SET @point = geography::STPointFromText('POINT(3.3109015 6.648294)', 4326); 

SELECT TOP (1)
    [Result].[PointId] AS [PointId], 
    [Result].[PointName] AS [PointName], 
    [Result].[LegendTypeId] AS [LegendTypeId], 
    [Result].[GeoPoint] AS [GeoPoint]
FROM ( 
   SELECT 
       [Extent1].[GeoPoint].STDistance(@point) AS distance, 
       [Extent1].[PointId] AS [PointId], 
       [Extent1].[PointName] AS [PointName], 
       [Extent1].[LegendTypeId] AS [LegendTypeId], 
       [Extent1].[GeoPoint] AS [GeoPoint]
   FROM [dbo].[GeographyPoint] AS [Extent1]
   WHERE 18 = [Extent1].[LegendTypeId] 
)  AS [Result]
ORDER By [Result].distance ASC

該表在 PK 上有一個聚集索引,在geography類型列上有一個空間索引。

在此處輸入圖像描述

因此,當我執行上述查詢時,它正在執行掃描操作。

在此處輸入圖像描述

所以我在列上創建了一個非聚集索引LegendTypeId

CREATE NONCLUSTERED INDEX [GeographyPoint_LegendType_NonClustered] ON [dbo].[GeographyPoint]
(
   [LegendTypeId] ASC
)
INCLUDE (   [PointId],
   [PointName],
   [GeoPoint]) 
   WITH (PAD_INDEX = OFF, 
   STATISTICS_NORECOMPUTE = OFF,
   SORT_IN_TEMPDB = OFF, 
   DROP_EXISTING = OFF,
   ONLINE = OFF,
   ALLOW_ROW_LOCKS = ON, 
   ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

並將查詢更改為:

DECLARE @point geography;
SET @point = geography::STPointFromText('POINT({0} {1})', 4326); 

SELECT TOP (1) 
    [GeoPoint].STDistance(@point) AS distance, 
    [PointId], 
    [PointName],
    [LegendTypeId], 
    [GeoPoint]
    FROM [GeographyPoint]
WHERE 18 = [LegendTypeId]
ORDER By distance ASC

現在 SQL Server 執行搜尋而不是掃描:

在此處輸入圖像描述

在我看來,這提高了查詢的效率,但是當我將它部署到生產環境時,我仍然得到相同的結果(CPU 使用率高,執行查詢平均需要 10 秒)。

注意:不會從該表中插入、更新或刪除數據——僅搜尋/讀取。

  1. 是我做錯了什麼嗎?
  2. 我怎樣才能解決這個問題?

編輯

Index Seak 詳細資訊

在此處輸入圖像描述

編輯2:

我更改了查詢,以使用連結中的方法:“最近鄰”:https ://msdn.microsoft.com/en-us/library/ff929109.aspx ,現在這是結果,此查詢也需要 3搜尋 -5 秒 - 類似於第二個查詢,(但未在生產環境中測試)

在此處輸入圖像描述

空間索引設置:

CREATE SPATIAL INDEX [SPATIAL_Point] ON [dbo].[GeographyPoint]
(
[GeoPoint]
)USING  GEOGRAPHY_GRID 
WITH (GRIDS =(LEVEL_1 = MEDIUM,LEVEL_2 = MEDIUM,LEVEL_3 = MEDIUM,LEVEL_4 = MEDIUM), 
CELLS_PER_OBJECT = 16, PAD_INDEX = OFF, 
STATISTICS_NORECOMPUTE = 
OFF, SORT_IN_TEMPDB = OFF,
DROP_EXISTING = OFF, 
ONLINE = OFF, 
ALLOW_ROW_LOCKS = ON, 
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

編輯 3 我按照@MickyT 的指示,刪除了索引[LegendTypeId],並執行了以下查詢:

DECLARE @point geography;
SET @point = geography::STPointFromText('POINT(3.3109 6.6482)', 4326); 

SELECT TOP (1) 

   [PointId],
   [PointName],
   [LegendTypeId], 
   [GeoPoint]
FROM [GeographyPoint] WITH(INDEX(SPATIAL_Point))
WHERE 
  [GeoPoint].STDistance(@point) IS NOT NULL AND
   18 = [LegendTypeId]
ORDER By [GeoPoint].STDistance(@point) ASC
OPTION(MAXDOP 1)

此查詢的統計資訊是

在此處輸入圖像描述

然後我再次執行了這個查詢:

DECLARE @point geography;
SET @point = geography::STPointFromText('POINT(3.3109 6.6482)', 4326); 

SELECT TOP (1) 
    [GeoPoint].STDistance(@point) AS distance, 
    [PointId], 
    [PointName],
    [LegendTypeId], 
    [GeoPoint]
    FROM [GeographyPoint] --WITH(INDEX(SPATIAL_Point))
WHERE 18 = [LegendTypeId]
ORDER By distance ASC

此查詢的統計資訊是

在此處輸入圖像描述

我使用以下設置來執行一些測試。

CREATE TABLE GeographyPoint (
   ID INTEGER IDENTITY(1,1) NOT NULL PRIMARY KEY,
   GeoPoint GEOGRAPHY NOT NULL,
   LegendTypeID INTEGER NOT NULL
   );

INSERT INTO GeographyPoint (GeoPoint, LegendTypeID)
SELECT TOP 1000000 
   Geography::Point(RAND(CAST(NEWID() AS VARBINARY(MAX))) * 2,RAND(CAST(NEWID() AS VARBINARY(MAX))) * 2,4326),
   CAST(RAND(CAST(NEWID() AS VARBINARY(MAX))) * 25 AS INTEGER)
FROM Tally;

CREATE INDEX GP_IDX1 ON GeographyPoint(LegendTypeID) INCLUDE (ID, GeoPoint);
CREATE SPATIAL INDEX GP_SIDX ON GeographyPoint(GeoPoint) USING GEOGRAPHY_AUTO_GRID;

這給出了一個包含 1,000,000 個隨機點的表,分佈為 2 x 2 度。

在嘗試了幾個不同的選項之後,我能得到的最佳性能是強制它使用空間索引。有幾種方法可以實現這一目標。刪除 LegendTypeID 上的索引或使用提示。
您將需要決定哪個最適合您的情況。就我個人而言,我不喜歡使用索引提示,如果其他查詢不需要它,我會刪除其他索引。

查詢相互疊加

DECLARE @point geography;
SET @point = geography::Point(1,1,4326); 
/*
Clustered index scan (PK)
SQL Server Execution Times:
  CPU time = 641 ms,  elapsed time = 809 ms
*/
SELECT TOP (1) 
   [GeoPoint].STDistance(@point) AS distance, 
   [ID], 
   [LegendTypeId], 
   [GeoPoint]
FROM [GeographyPoint]
WHERE 18 = [LegendTypeId]
ORDER By distance ASC
OPTION(MAXDOP 1)
/*
Index Seek NonClustered (GP_IDX1)
SQL Server Execution Times:
  CPU time = 2250 ms,  elapsed time = 2806 ms
*/
SELECT TOP (1) 
   [GeoPoint].STDistance(@point) AS distance, 
   [ID], 
   [LegendTypeId], 
   [GeoPoint]
FROM [GeographyPoint]
WHERE [GeoPoint].STDistance(@point) IS NOT NULL AND
   18 = [LegendTypeId]
ORDER By [GeoPoint].STDistance(@point) ASC
OPTION(MAXDOP 1)

/*
For the next 2 queries
Clustered Index Seek (Spatial)
SQL Server Execution Times:
  CPU time = 15 ms,  elapsed time = 11 ms
*/
SELECT TOP (1) 
   [GeoPoint].STDistance(@point) AS distance, 
   [ID], 
   [LegendTypeId], 
   [GeoPoint]
FROM [GeographyPoint] WITH(INDEX(GP_SIDX))
WHERE [GeoPoint].STDistance(@point) IS NOT NULL AND
   18 = [LegendTypeId]
ORDER By [GeoPoint].STDistance(@point) ASC
OPTION(MAXDOP 1)

DROP INDEX GP_IDX1 ON [GeographyPoint]

SELECT TOP (1) 
   [GeoPoint].STDistance(@point) AS distance, 
   [ID], 
   [LegendTypeId], 
   [GeoPoint]
FROM [GeographyPoint]
WHERE [GeoPoint].STDistance(@point) IS NOT NULL AND
   18 = [LegendTypeId]
ORDER By [GeoPoint].STDistance(@point) ASC
OPTION(MAXDOP 1)

引用自:https://dba.stackexchange.com/questions/124217