Sql-Server

SQL 查詢優化 - 長時間執行的查詢

  • June 12, 2020

我有一個包含以下表格的數據庫:

CREATE TABLE [Document](
   [IdDocument] [int] NOT NULL,
   [DocumentType] [int] NOT NULL
   CONSTRAINT PK_Document PRIMARY KEY CLUSTERED (IdDocument)
)

CREATE TABLE [ExternalKey](
   [IdExternalKey] [int] NOT NULL,
   [RefDocument] [int] NOT NULL,
   [EntityType] [int] NOT NULL,
   [Value] [int] NOT NULL, 
   CONSTRAINT PK_ExternalKey PRIMARY KEY CLUSTERED (IdExternalKey)
   CONSTRAINT FK_RefDocument FOREIGN KEY (RefDocument) REFERENCES Document(IdDocument),
   CONSTRAINT UC_ExternalKey UNIQUE (RefDocument, EntityType, Value)
)
  • 文件映射到驅動器上的物理文件。每個文件都有一個類型(例如:2 = IDENTITY_CARD)
  • 這些文件與具有一對多關係的外部實體連結。每個外部鍵都有一個實體類型(例如:50 = PERSON)和一個 ID(例如:213235)

數據庫很大:文件是10M記錄。外鍵 40M。

ExternalKey 內的數據範例:

IdExternalKey  RefDocument EntityType  Value
1              1           50          3421
2              1           50          9524
3              1           60          7893
4              2           50          1752
5              2           50          8979 

我想根據文件的類型和與之鍊接的實體(簡化查詢)過濾文件:

WHERE (DocumentType IN (10, 20, ...) 
      AND ExternalKeys ANY (EntityType = 50 AND Value IN (4,5,6,7,8,9,...)))    
  OR (DocumentType IN (80, 90, ...) 
      AND ExternalKeys ANY (EntityType = 60 AND Value IN (110,120,130,...)))

我能想到的最好的是:

SELECT TOP 50 IdDocument
FROM Document d     
WHERE
   (d.DocumentType IN (SELECT Id FROM @listA1) AND d.IdDocument IN (SELECT ex.RefDocument FROM ExternalKey ex WHERE ex.EntityType = 60 AND ex.Value IN (SELECT Id FROM @listB1))) OR
   (d.DocumentType IN (SELECT Id FROM @listA2) AND d.IdDocument IN (SELECT ex.RefDocument FROM ExternalKey ex WHERE ex.EntityType = 61 AND ex.Value IN (SELECT Id FROM @listB2))) OR
   ...
   (d.DocumentType IN (SELECT Id FROM @listA3) AND d.IdDocument IN (SELECT ex.RefDocument FROM ExternalKey ex WHERE ex.EntityType = 59 AND ex.Value IN (SELECT Id FROM @listB3)))
ORDER BY IdDocument

表值參數包含很多ID(總共約100K) 表定義為:

CREATE TYPE int_list_type AS TABLE(Id int NOT NULL PRIMARY KEY)

查詢速度很慢,性能因排序而異(從十幾秒到幾分鐘不等)。例如,對文件進行IdDocument排序比對它們進行排序要快得多DocumentType。到目前為止我已經嘗試過:

  • RefDocument在, EntityType, Value(多列索引)上創建索引。我嘗試了不同的列順序。
  • 用查詢中的硬編碼值替換錶值參數(例如:@listA1):解析時間爆炸(如預期)。
  • 用 EXISTS 替換 IN :性能相似。用 JOIN 替換 IN :慢很多

是否有一些更有效的方法來執行過濾器,任何提示?

這是 2 個 OR 子句的查詢計劃(每個 OR 子句都是 RED) 在此處輸入圖像描述

問題後編輯得到更多細節。

我得到一個聚集索引掃描和一個索引掃描。但是我的表中沒有很多行。

CREATE TABLE [Document](
[IdDocument] [int] NOT NULL,
[DocumentType] [int] NOT NULL,
CONSTRAINT PK_Document PRIMARY KEY CLUSTERED (IdDocument)
)

CREATE TABLE [ExternalKey](
[IdExternalKey] [int] NOT NULL,
[RefDocument] [int] NOT NULL,
[EntityType] [int] NOT NULL,
[Value] [int] NOT NULL, 
CONSTRAINT PK_ExternalKey PRIMARY KEY CLUSTERED (IdExternalKey),
CONSTRAINT FK_RefDocument FOREIGN KEY (RefDocument) REFERENCES Document(IdDocument),
CONSTRAINT UC_ExternalKey UNIQUE (RefDocument, EntityType, Value)
)

-- Insert a few values
DECLARE @IdDocument INT, @DocumentType INT, @IdExternalKey INT;
Set @IdDocument=0;
SET @DocumentType = 10;
select @IdExternalKey=MAX(IdExternalKey)+1 from ExternalKey ;


set nocount on
WHILE (@IdDocument < 10000)
BEGIN
   INSERT INTO Document(IdDocument, DocumentType) VALUES (@IdDocument, @DocumentType);
   INSERT INTO [ExternalKey](IdExternalKey,RefDocument,EntityType,Value) VALUES (@IdExternalKey, @IdDocument, 50, 3421);
   INSERT INTO [ExternalKey](IdExternalKey,RefDocument,EntityType,Value) VALUES (@IdExternalKey+1, @IdDocument, 50, 9524);
   INSERT INTO [ExternalKey](IdExternalKey,RefDocument,EntityType,Value) VALUES (@IdExternalKey+2, @IdDocument, 60, 7893);
   INSERT INTO [ExternalKey](IdExternalKey,RefDocument,EntityType,Value) VALUES (@IdExternalKey+3, @IdDocument, 50, 1752);
   INSERT INTO [ExternalKey](IdExternalKey,RefDocument,EntityType,Value) VALUES (@IdExternalKey+4, @IdDocument, 50, 8979);
   INSERT INTO [ExternalKey](IdExternalKey,RefDocument,EntityType,Value) VALUES (@IdExternalKey+5, @IdDocument, 60, 7822);

   SET @DocumentType=CASE WHEN @DocumentType > 100 THEN 10 ELSE @DocumentType + 10 END;
   SET @IdDocument = @IdDocument + 1;
   SET @IdExternalKey = @IdExternalKey+ 6;
END

SELECT TOP 50 IdDocument
FROM Document d
JOIN ExternalKey e on e.RefDocument=d.IdDocument 
WHERE
(d.DocumentType IN (10,20,30) AND e.EntityType = 60 AND e.Value IN (4,5,6,7,8)) OR
(d.DocumentType IN (50) AND e.EntityType = 61 AND e.Value IN (110,120,130) OR
(d.DocumentType IN (80,90) AND e.EntityType = 59 AND e.Value IN (1,2,3)))
ORDER BY IdDocument

我要補充一點,如果 ORDER BY 不是必需的並且可以在程序中完成,它可以幫助加速查詢。

在此處輸入圖像描述

您真正想要改進查詢的方法是修復 ExternalKey 表的“索引掃描”或 Document 表的掃描。例如,既然 ExternalKey 是 RefDocument、entityType、Value 唯一的,也許可以考慮將其用作聚集索引?

現在的問題是,有了鍵和索引,SQL Server 別無選擇,只能掃描整個表以獲取所需的行。因此,如果您有一百萬行,即使結果只有 10 行,它也必須掃描每一行。

主要問題是查詢本身。

如果@listA1,@listB1表類型是強制搜尋條件,那麼它們只能放入表類型@listA,@listB

SELECT TOP 50 IdDocument
FROM Document d     
WHERE EXISTS
   ( SELECT Id FROM @listA A1 where a1.Id= d.DocumentType)
    AND EXISTS(
   SELECT ex.RefDocument FROM ExternalKey ex WHERE  ex.RefDocument=d.IdDocument and  ex.EntityTypein(60,61,62) AND ex.Value IN 
   (SELECT Id FROM @listB)) 
ORDER BY IdDocument

解釋您的搜尋要求和參數並修復您的查詢。

引用自:https://dba.stackexchange.com/questions/192433