Sql-Server

Xml 索引,行數緩慢

  • October 29, 2019

在以下情況下,我沒有得到索引搜尋。<Number>xxx</Number>而不是像這篇文章那樣插入xml 列為什麼當 where 子句過濾 value() 時不使用二級選擇性索引?,插入 100k 行與此 xml<SomeText>NiceText</SomeText>和類似數量的行<SomeText>MoreText</SomeText>。不需要100k。只需要很多。然後添加索引

create selective xml index SIX_T on dbo.T(XMLDoc) for
(
   pathXQUERY = '/SomeText' as xquery 'xs:string' maxlength(8) singleton
);

和二級索引

create xml index SIX_T_pathXQUERY on dbo.T(XMLDoc)
 using xml index SIX_T for (pathXQUERY);

然後數一數

select count(*)
from dbo.T as T
where T.XMLDoc.exist('/SomeText[. eq "MoreText"]') = 1;

請注意,它不使用索引搜尋並且“慢”。數百萬行可能需要幾秒鐘。如果我在標準列中插入相同的值並向其添加索引並執行

select count(id)
   from dbo.T as T where SomeTextColumn = 'MoreText'

我立即得到結果。在 sql server 18.3.1 上完成的所有測試

問題是,如何讓 xml 計數與按列計數一樣快?

謝謝

數據差異

由於要查找的記錄數量很少,優化器可以使用SIX_T_pathXQUERY索引:

在此處輸入圖像描述

moretext並使用搜尋謂詞進行過濾:

在此處輸入圖像描述

這裡有趣的部分是它執行鍵查找以獲取不為空的 path_1_id 值。

在此處輸入圖像描述

因為這是非聚集 xml 索引上的過濾器定義……

在此處輸入圖像描述

…雖然不存在於索引本身中。

結果,對於考慮使用索引的優化器,它知道它必須完成以下步驟:

  • 使用內部表上的二級索引過濾 XML 值
  • SIX_T將這些返回值與內部表上的聚集索引匹配並過濾,path_1_id is not null因為path_1_id不包含在二級索引中
  • 將這些值與實際表匹配,dbo.T以返回要依賴的 ID

臨界點

當要返回的預期行較高時,它傾向於使用選擇性聚集 XML 索引,以便能夠執行合併連接和無鍵查找:

在此處輸入圖像描述

帶有殘差謂詞:

在此處輸入圖像描述

過濾 xml 列和path_1_id

比較計劃

您可以(但您不應該)使用USE PLAN提示來強制執行計劃,並在 XML 列上進行搜尋,看看如果我們在哪裡搜尋這些值會發生什麼。

在此處輸入圖像描述

隨著執行時間=

  CPU time = 218 ms,  elapsed time = 215 ms.

以及掃描+合併連接計劃的執行時間:

  CPU time = 62 ms,  elapsed time = 58 ms.

簡而言之,我相信帶有殘差謂詞+合併連接選擇的掃描是優化器的正確選擇。

沒辦法

雖然我可能錯了,但我認為沒有辦法使用正常 XML 索引來改進計數查詢。我們也無法更改這些內部表,甚至無法查詢它們:

SELECT * FROM 
[sys].[xml_sxi_table_1463676262_256000];

它甚至在合併連接查詢計劃上提供了一個缺失的索引提示,您無法創建:

CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [sys].[xml_sxi_table_1463676262_256000] ([pathXQUERY_1_value])
INCLUDE ([path_1_id])

要改進查詢,您必須研究非 xml 索引解決方案。


編輯

如何使用 USE PLAN 提示?您有此查詢的範常式式碼嗎?

我建議不要這樣做,但出於測試目的,它會很好。

**第 1 步:**您必須獲取查詢的實際執行計劃並獲取 xml:

在此處輸入圖像描述

如果您沒有低估計的查詢,請輸入一個不存在的值,例如:

select count(*)
from dbo.T as T
where T.XMLDoc.exist('/SomeText[. eq "bbb"]') = 1

**步驟2:**將執行計劃xml中的所有's替換為s,稍後我們將需要這個。''

在此處輸入圖像描述

**第 3 步:**粘貼計劃OPTION( USE PLAN '')

SELECT count(*)
FROM dbo.T as T
WHERE T.XMLDoc.exist('/SomeText[. eq "MoreText"]') = 1
OPTION(USE PLAN 
'');

**第 4 步:**我必須將 utf-16 更改為 utf-8 才能使使用計劃提示生效

從:

OPTION(USE PLAN
'<?xml version="1.0" encoding="utf-16"?>

到:

OPTION(USE PLAN
'<?xml version="1.0" encoding="utf-8"?>

**第 5 步:**執行查詢。

我的查詢現在看起來像這樣:

select count(*)
from dbo.T as T
where T.XMLDoc.exist('/SomeText[. eq "MoreText"]') = 1
OPTION(USE PLAN
'<?xml version="1.0" encoding="utf-8"?>
<ShowPlanXML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Version="1.481" Build="14.0.3223.3" xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
 <BatchSequence>
   <Batch>
     <Statements>
       <StmtSimple StatementCompId="1" StatementEstRows="1" StatementId="1" StatementOptmLevel="FULL" StatementOptmEarlyAbortReason="GoodEnoughPlanFound" CardinalityEstimationModelVersion="70" StatementSubTreeCost="0.00986014" StatementText="select count(*)
from dbo.T as T
where T.XMLDoc.exist(''/SomeText[. eq "bbb"]'') = 1" StatementType="SELECT" QueryHash="0x412154B6AD55BBFC" QueryPlanHash="0x280B174BF20902E3" RetrievedFromCache="true" SecurityPolicyApplied="false">
         <StatementSetOptions ANSI_NULLS="true" ANSI_PADDING="true" ANSI_WARNINGS="true" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="true" NUMERIC_ROUNDABORT="false" QUOTED_IDENTIFIER="true" />
         <QueryPlan DegreeOfParallelism="1" CachedPlanSize="40" CompileTime="394" CompileCPU="301" CompileMemory="648">
           <MemoryGrantInfo SerialRequiredMemory="0" SerialDesiredMemory="0" />
           <OptimizerHardwareDependentProperties EstimatedAvailableMemoryGrant="419404" EstimatedPagesCached="52425" EstimatedAvailableDegreeOfParallelism="2" MaxCompileMemory="3655432" />
           <QueryTimeStats CpuTime="0" ElapsedTime="0" />
           <RelOp AvgRowSize="11" EstimateCPU="0" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" LogicalOp="Compute Scalar" NodeId="0" Parallel="false" PhysicalOp="Compute Scalar" EstimatedTotalSubtreeCost="0.00986014">
             <OutputList>
               <ColumnReference Column="Expr1017" />
             </OutputList>
             <ComputeScalar>
               <DefinedValues>
                 <DefinedValue>
                   <ColumnReference Column="Expr1017" />
                   <ScalarOperator ScalarString="CONVERT_IMPLICIT(int,[Expr1020],0)">
                     <Convert DataType="int" Style="0" Implicit="true">
                       <ScalarOperator>
                         <Identifier>
                           <ColumnReference Column="Expr1020" />
                         </Identifier>
                       </ScalarOperator>
                     </Convert>
                   </ScalarOperator>
                 </DefinedValue>
               </DefinedValues>
               <RelOp AvgRowSize="11" EstimateCPU="1.1E-06" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" LogicalOp="Aggregate" NodeId="1" Parallel="false" PhysicalOp="Stream Aggregate" EstimatedTotalSubtreeCost="0.00986014">
                 <OutputList>
                   <ColumnReference Column="Expr1020" />
                 </OutputList>
                 <RunTimeInformation>
                   <RunTimeCountersPerThread Thread="0" ActualRows="1" Batches="0" ActualEndOfScans="1" ActualExecutions="1" ActualExecutionMode="Row" ActualElapsedms="0" ActualCPUms="0" />
                 </RunTimeInformation>
                 <StreamAggregate>
                   <DefinedValues>
                     <DefinedValue>
                       <ColumnReference Column="Expr1020" />
                       <ScalarOperator ScalarString="Count(*)">
                         <Aggregate AggType="countstar" Distinct="false" />
                       </ScalarOperator>
                     </DefinedValue>
                   </DefinedValues>
                   <RelOp AvgRowSize="9" EstimateCPU="1E-06" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" LogicalOp="Aggregate" NodeId="3" Parallel="false" PhysicalOp="Stream Aggregate" EstimatedTotalSubtreeCost="0.00985904">
                     <OutputList />
                     <RunTimeInformation>
                       <RunTimeCountersPerThread Thread="0" ActualRows="0" Batches="0" ActualEndOfScans="1" ActualExecutions="1" ActualExecutionMode="Row" ActualElapsedms="0" ActualCPUms="0" />
                     </RunTimeInformation>
                     <StreamAggregate>
                       <DefinedValues />
                       <GroupBy>
                         <ColumnReference Database="[adventureworks]" Schema="[dbo]" Table="[T]" Alias="[T]" Column="ID" />
                       </GroupBy>
                       <RelOp AvgRowSize="11" EstimateCPU="4.18E-06" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" LogicalOp="Inner Join" NodeId="4" Parallel="false" PhysicalOp="Nested Loops" EstimatedTotalSubtreeCost="0.00985804">
                         <OutputList>
                           <ColumnReference Database="[adventureworks]" Schema="[dbo]" Table="[T]" Alias="[T]" Column="ID" />
                         </OutputList>
                         <RunTimeInformation>
                           <RunTimeCountersPerThread Thread="0" ActualRows="0" Batches="0" ActualEndOfScans="1" ActualExecutions="1" ActualExecutionMode="Row" ActualElapsedms="0" ActualCPUms="0" />
                         </RunTimeInformation>
                         <NestedLoops Optimized="false">
                           <OuterReferences>
                             <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
                           </OuterReferences>
                           <RelOp AvgRowSize="16" EstimateCPU="4.18E-06" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" LogicalOp="Inner Join" NodeId="5" Parallel="false" PhysicalOp="Nested Loops" EstimatedTotalSubtreeCost="0.00657038">
                             <OutputList>
                               <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
                             </OutputList>
                             <RunTimeInformation>
                               <RunTimeCountersPerThread Thread="0" ActualRows="0" Batches="0" ActualEndOfScans="1" ActualExecutions="1" ActualExecutionMode="Row" ActualElapsedms="0" ActualCPUms="0" />
                             </RunTimeInformation>
                             <NestedLoops Optimized="false">
                               <OuterReferences>
                                 <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
                                 <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="row_id" />
                               </OuterReferences>
                               <RelOp AvgRowSize="15" EstimateCPU="0.0001581" EstimateIO="0.003125" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" EstimatedRowsRead="1" LogicalOp="Index Seek" NodeId="6" Parallel="false" PhysicalOp="Index Seek" EstimatedTotalSubtreeCost="0.0032831" TableCardinality="150002">
                                 <OutputList>
                                   <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
                                   <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="row_id" />
                                 </OutputList>
                                 <RunTimeInformation>
                                   <RunTimeCountersPerThread Thread="0" ActualRows="0" Batches="0" ActualEndOfScans="1" ActualExecutions="1" ActualExecutionMode="Row" ActualElapsedms="0" ActualCPUms="0" ActualScans="1" ActualLogicalReads="3" ActualPhysicalReads="0" ActualReadAheads="0" ActualLobLogicalReads="0" ActualLobPhysicalReads="0" ActualLobReadAheads="0" />
                                 </RunTimeInformation>
                                 <IndexScan Ordered="true" ScanDirection="FORWARD" ForcedIndex="false" ForceSeek="false" ForceScan="false" NoExpandHint="false" Storage="RowStore">
                                   <DefinedValues>
                                     <DefinedValue>
                                       <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
                                     </DefinedValue>
                                     <DefinedValue>
                                       <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="row_id" />
                                     </DefinedValue>
                                   </DefinedValues>
                                   <Object Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Index="[SIX_T_pathXQUERY]" Filtered="true" Alias="[SomeText:1]" IndexKind="SecondarySelectiveXML" Storage="RowStore" />
                                   <SeekPredicates>
                                     <SeekPredicateNew>
                                       <SeekKeys>
                                         <Prefix ScanType="EQ">
                                           <RangeColumns>
                                             <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pathXQUERY_1_value" />
                                           </RangeColumns>
                                           <RangeExpressions>
                                             <ScalarOperator ScalarString="N''bbb''">
                                               <Const ConstValue="N''bbb''" />
                                             </ScalarOperator>
                                           </RangeExpressions>
                                         </Prefix>
                                       </SeekKeys>
                                     </SeekPredicateNew>
                                   </SeekPredicates>
                                 </IndexScan>
                               </RelOp>
                               <RelOp AvgRowSize="461" EstimateCPU="0.0001581" EstimateIO="0.003125" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" LogicalOp="Clustered Index Seek" NodeId="8" Parallel="false" PhysicalOp="Clustered Index Seek" EstimatedTotalSubtreeCost="0.0032831" TableCardinality="150002">
                                 <OutputList />
                                 <RunTimeInformation>
                                   <RunTimeCountersPerThread Thread="0" ActualRows="0" Batches="0" ActualEndOfScans="0" ActualExecutions="0" ActualExecutionMode="Row" ActualElapsedms="0" ActualCPUms="0" ActualScans="0" ActualLogicalReads="0" ActualPhysicalReads="0" ActualReadAheads="0" ActualLobLogicalReads="0" ActualLobPhysicalReads="0" ActualLobReadAheads="0" />
                                 </RunTimeInformation>
                                 <IndexScan Lookup="true" Ordered="true" ScanDirection="FORWARD" ForcedIndex="false" ForceSeek="false" ForceScan="false" NoExpandHint="false" Storage="RowStore">
                                   <DefinedValues />
                                   <Object Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Index="[SIX_T]" Alias="[SomeText:1]" TableReferenceId="-1" IndexKind="SelectiveXML" Storage="RowStore" />
                                   <SeekPredicates>
                                     <SeekPredicateNew>
                                       <SeekKeys>
                                         <Prefix ScanType="EQ">
                                           <RangeColumns>
                                             <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
                                             <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="row_id" />
                                           </RangeColumns>
                                           <RangeExpressions>
                                             <ScalarOperator ScalarString="[adventureworks].[sys].[xml_sxi_table_1463676262_256000].[pk1] as [SomeText:1].[pk1]">
                                               <Identifier>
                                                 <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
                                               </Identifier>
                                             </ScalarOperator>
                                             <ScalarOperator ScalarString="[adventureworks].[sys].[xml_sxi_table_1463676262_256000].[row_id] as [SomeText:1].[row_id]">
                                               <Identifier>
                                                 <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="row_id" />
                                               </Identifier>
                                             </ScalarOperator>
                                           </RangeExpressions>
                                         </Prefix>
                                       </SeekKeys>
                                     </SeekPredicateNew>
                                   </SeekPredicates>
                                   <Predicate>
                                     <ScalarOperator ScalarString="[adventureworks].[sys].[xml_sxi_table_1463676262_256000].[path_1_id] as [SomeText:1].[path_1_id] IS NOT NULL">
                                       <Logical Operation="IS NOT NULL">
                                         <ScalarOperator>
                                           <Identifier>
                                             <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="path_1_id" />
                                           </Identifier>
                                         </ScalarOperator>
                                       </Logical>
                                     </ScalarOperator>
                                   </Predicate>
                                 </IndexScan>
                               </RelOp>
                             </NestedLoops>
                           </RelOp>
                           <RelOp AvgRowSize="11" EstimateCPU="0.0001581" EstimateIO="0.003125" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" EstimatedRowsRead="1" LogicalOp="Clustered Index Seek" NodeId="9" Parallel="false" PhysicalOp="Clustered Index Seek" EstimatedTotalSubtreeCost="0.0032831" TableCardinality="200002">
                             <OutputList>
                               <ColumnReference Database="[adventureworks]" Schema="[dbo]" Table="[T]" Alias="[T]" Column="ID" />
                             </OutputList>
                             <RunTimeInformation>
                               <RunTimeCountersPerThread Thread="0" ActualRows="0" Batches="0" ActualEndOfScans="0" ActualExecutions="0" ActualExecutionMode="Row" ActualElapsedms="0" ActualCPUms="0" ActualScans="0" ActualLogicalReads="0" ActualPhysicalReads="0" ActualReadAheads="0" ActualLobLogicalReads="0" ActualLobPhysicalReads="0" ActualLobReadAheads="0" />
                             </RunTimeInformation>
                             <IndexScan Ordered="true" ScanDirection="FORWARD" ForcedIndex="false" ForceSeek="false" ForceScan="false" NoExpandHint="false" Storage="RowStore">
                               <DefinedValues>
                                 <DefinedValue>
                                   <ColumnReference Database="[adventureworks]" Schema="[dbo]" Table="[T]" Alias="[T]" Column="ID" />
                                 </DefinedValue>
                               </DefinedValues>
                               <Object Database="[adventureworks]" Schema="[dbo]" Table="[T]" Index="[PK__T__3214EC27EF7043C5]" Alias="[T]" IndexKind="Clustered" Storage="RowStore" />
                               <SeekPredicates>
                                 <SeekPredicateNew>
                                   <SeekKeys>
                                     <Prefix ScanType="EQ">
                                       <RangeColumns>
                                         <ColumnReference Database="[adventureworks]" Schema="[dbo]" Table="[T]" Alias="[T]" Column="ID" />
                                       </RangeColumns>
                                       <RangeExpressions>
                                         <ScalarOperator ScalarString="[adventureworks].[sys].[xml_sxi_table_1463676262_256000].[pk1] as [SomeText:1].[pk1]">
                                           <Identifier>
                                             <ColumnReference Database="[adventureworks]" Schema="[sys]" Table="[xml_sxi_table_1463676262_256000]" Alias="[SomeText:1]" Column="pk1" />
                                           </Identifier>
                                         </ScalarOperator>
                                       </RangeExpressions>
                                     </Prefix>
                                   </SeekKeys>
                                 </SeekPredicateNew>
                               </SeekPredicates>
                             </IndexScan>
                           </RelOp>
                         </NestedLoops>
                       </RelOp>
                     </StreamAggregate>
                   </RelOp>
                 </StreamAggregate>
               </RelOp>
             </ComputeScalar>
           </RelOp>
         </QueryPlan>
       </StmtSimple>
     </Statements>
   </Batch>
 </BatchSequence>
</ShowPlanXML>');

問題是,如何讓 xml 計數與按列計數一樣快?

根據您需要使用 xPath 表達式的靈活性,您可以使用屬性提升

create function dbo.GetSomeText(@X xml) returns varchar(8) with schemabinding as
begin
 return @X.value('(/SomeText/text())[1]', 'varchar(8)');
end;

go

alter table dbo.T add SomeTextColumn as dbo.GetSomeText(XMLDoc);

go

create index T_IX_SomeText on dbo.T(SomeTextColumn);

go

select count(T.ID)
from dbo.T as T
where T.SomeTextColumn = 'MoreText';

在此處輸入圖像描述

引用自:https://dba.stackexchange.com/questions/252047