Sql-Server
在沒有函式的情況下以相同的順序拆分兩個分隔的字元串
我正在嘗試將帶有分隔字元串的兩列拆分為行。每個字元串中值的位置是相關的,因此我試圖將其拆分,以便相關值在一行中。我無法使用函式,因為我無法在數據庫中創建對象
這是範例表和數據
CREATE TABLE #temp (id INT, keys VARCHAR(50), vals VARCHAR(50) ); INSERT INTO #temp VALUES (1, '1,2,3', 'one,two,three'), (2, '4,5,6', 'four,five,six'), (3, '7,8,9', 'seven,eight,nine');
我想要的輸出是
ID key val 1 1 one 1 2 two 1 3 three 2 4 four 2 5 five 2 6 six 3 7 seven 3 8 eight 3 9 nine
如果我只拆分一列,我的查詢就可以工作,所以我用 row_number 定義了兩個 CTE,並在 ID 和 row_number 上加入。這確實提供了所需的輸出,但我的實時表非常大,我希望有一種方法可以只通過表一次,而不是兩次。
with keys as( SELECT id,keys,vals, keys.keyid.value('.', 'VARCHAR(8000)') AS keyid, row_number() over(order by (select null)) as rn FROM (SELECT id,keys,vals, CAST('<Keys><key>'+REPLACE(keys, ',', '</key><key>')+'</key></Keys>' AS XML) AS tempkeys FROM #temp ) AS temp CROSS APPLY tempkeys.nodes('/Keys/key') AS keys(keyid)), vals as( SELECT id,keys,vals, vals.val.value('.', 'VARCHAR(8000)') AS valid, row_number() over(order by (select null)) as rn FROM (SELECT id,keys,vals, CAST('<vals><val>'+REPLACE(vals, ',', '</val><val>')+'</val></vals>' AS XML) AS tempvals FROM #temp ) AS temp CROSS APPLY tempvals.nodes('/vals/val') AS vals(val)) SELECT k.id, k.keyid, v.valid FROM keys AS k INNER JOIN vals AS v ON k.id = v.id AND k.rn = v.rn;
msdb
在其他地方或其他地方創建函式。CREATE FUNCTION dbo.SplitTwoStringsWithSameOrder ( @List1 varchar(50), @List2 varchar(50), @Delim varchar(10) ) RETURNS TABLE AS RETURN ( WITH src(r) AS ( SELECT 1 UNION ALL SELECT r + 1 FROM src WHERE r < 10 ), Numbers(Number) AS ( SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM src AS s1, src AS s2 -- add more if you need longer strings ), parsed(s1,s2,r1,r2) AS ( SELECT SUBSTRING(@List1, n1.Number, CHARINDEX(@Delim, @List1 + @Delim, n1.Number) - n1.Number), SUBSTRING(@List2, n2.Number, CHARINDEX(@Delim, @List2 + @Delim, n2.Number) - n2.Number), r1 = ROW_NUMBER() OVER (ORDER BY n1.Number), r2 = ROW_NUMBER() OVER (ORDER BY n2.Number) FROM Numbers AS n1, Numbers AS n2 ON n1.Number <= LEN(@List1) AND n2.Number <= LEN(@List2) AND SUBSTRING(@Delim + @List1, n1.Number, LEN(@Delim)) = @Delim AND SUBSTRING(@Delim + @List2, n2.Number, LEN(@Delim)) = @Delim ) SELECT s1, s2, r1, r2 FROM parsed WHERE r1 = r2 );
然後,正如@gbn 所指出的,在您的查詢必須執行的任何地方通過 3 部分名稱引用它。
CREATE TABLE #temp (id INT, keys VARCHAR(50), vals VARCHAR(50) ); INSERT INTO #temp VALUES (1, '1,2,3', 'one,two,three'), (2, '4,5,6', 'four,five,six'), (3, '7,8,9', 'seven,eight,nine'); SELECT t.id, f.s1, f.s2 FROM #temp AS t CROSS APPLY msdb.dbo.SplitTwoStringsWithSameOrder(keys, vals, ',') AS f ORDER BY t.id, f.r1; GO DROP TABLE #temp;
結果:
計劃資源管理器中顯示的最終計劃*(免責聲明:我是產品經理)*不是我見過的最漂亮的東西(點擊放大一點):
但是只有一次掃描#temp(4% 的成本)。最大的成本是兩種和一個線軸,並且由於工作台而存在一些 I/O,我不確定這是否可以避免。
如果您知道這些字元串中的任何一個都只有 50 個字元,那麼您可以使用內置
Numbers
表格獲得一個更簡單的計劃(人們反對這些,但它們非常有用,而且它們幾乎總是在記憶體中如果你足夠引用它們)。這對 I/O 沒有幫助,但刪除遞歸 CTE 和其他在函式內建構數字的結構對 CPU 等非常有幫助。一、數字表:
DROP TABLE dbo.Numbers; ;WITH n AS ( SELECT TOP (50) rn = ROW_NUMBER() OVER (ORDER BY [object_id]) FROM sys.all_columns ORDER BY [object_id] ) SELECT [Number] = rn - 1 INTO dbo.Numbers FROM n; CREATE UNIQUE CLUSTERED INDEX n ON dbo.Numbers([Number]);
然後是函式的第二個版本:
CREATE FUNCTION dbo.SplitTwoStringsWithSameOrder2 ( @List1 varchar(50), @List2 varchar(50), @Delim nvarchar(10) ) RETURNS TABLE AS RETURN ( WITH parsed(s1,s2,r1,r2) AS ( SELECT SUBSTRING(@List1, n1.Number, CHARINDEX(@Delim, @List1 + @Delim, n1.Number) - n1.Number), SUBSTRING(@List2, n2.Number, CHARINDEX(@Delim, @List2 + @Delim, n2.Number) - n2.Number), r1 = ROW_NUMBER() OVER (ORDER BY n1.Number), r2 = ROW_NUMBER() OVER (ORDER BY n2.Number) FROM dbo.Numbers AS n1 INNER JOIN dbo.Numbers AS n2 ON n1.Number <= LEN(@List1) AND n2.Number <= LEN(@List2) AND SUBSTRING(@Delim + @List1, n1.Number, LEN(@Delim)) = @Delim AND SUBSTRING(@Delim + @List2, n2.Number, LEN(@Delim)) = @Delim ) SELECT s1, s2, r1, r2 FROM parsed WHERE r1 = r2 ); GO
這是產生的更簡單的計劃(再次點擊放大):
該計劃仍然有兩個排序操作,但線軸已經消失,仍然只有一次掃描
#temp
,並且在我的有限測試中,成本數字(絕對成本數字,而不是 %)每次都更好。我不確切知道其中任何一個都會擴展更多行,但值得測試,如果您將其與其他解決方案進行權衡並且它不能很好地擴展,那麼您可能需要重新考慮設計(儲存這些關係而不是逗號分隔的集合)。