Sql-Server
SQL: Split an address line into separate columns
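What is the best way to split a column holding a comma-separated list into separate columns? I'm looking for a simple algorithm, and I've heard it is best to avoid scalar-valued functions because they can be slow. Searching turned up many approaches: substring, find, splitstring. We need to run this over millions of addresses, so I'm looking for the best, most efficient answer that follows good coding practice.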
    create table dbo.AddressTest (AddressLine varchar(255))

    insert into dbo.AddressTest (AddressLine)
    values ('123 Maple Street, Austin, Texas, 78653, 555-234-4589')

    -- may not require substring, just looking for good way
    select
        substring(...) as AddressStreet, -- Expected: 123 Maple Street
        substring(...) as City,          -- Expected: Austin
        substring(...) as State,         -- Expected: Texas
        substring(...) as ZipCode,       -- Expected: 78653
        substring(...) as PhoneNumber    -- Expected: 555-234-4589
    from dbo.AddressTest
I'm also looking into this approach: https://stackoverflow.com/questions/19449492/using-t-sql-return-nth-delimited-element-from-a-string
STRING_SPLIT() in SQL Server 2016 is probably the best option. I tested it against 1 million rows and it returned results in 12 seconds (a fairly contrived test).
Because the number of elements is known, you can use the PIVOT operator to turn the split rows into columns.
Setup code
    CREATE TABLE #Table (ID INT IDENTITY, StreetAddress VARCHAR(MAX))

    DECLARE @I INT = 1000000

    -- Load one million identical test rows
    WHILE (@I) > 0
    BEGIN
        INSERT INTO #Table (StreetAddress)
        VALUES ('123 Fake St, BigCity, My State, 12345')
        SET @I = @I - 1
    END
Code to retrieve the split address elements
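    SELECT ID, [1] AS Street, [2] AS City, [3] AS State, [4] AS Code
    FROM (
        -- Number the elements within each address so PIVOT can place them.
        -- Note: STRING_SPLIT does not document an output order, so this
        -- numbering relies on rows arriving in source order.
        SELECT ID, value,
               ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) AS Rn
        FROM #Table
        CROSS APPLY STRING_SPLIT(REPLACE(StreetAddress, ', ', ','), ',')
    ) src
    PIVOT (
        MAX(value) FOR Rn IN ([1], [2], [3], [4])
    ) pvt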
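One caveat on the query above: the documentation for STRING_SPLIT says the order of its output rows is not guaranteed, so numbering them with ROW_NUMBER() depends on behaviour that happens to hold in practice. If you are on a newer version (this is an assumption about your environment: SQL Server 2022 or Azure SQL Database, not 2016), STRING_SPLIT accepts a third `enable_ordinal` argument and returns an `ordinal` column you can pivot on instead. A minimal sketch, reusing the #Table setup from this answer:

```sql
-- Assumes SQL Server 2022 / Azure SQL Database, where STRING_SPLIT
-- accepts a third argument (enable_ordinal = 1) and returns an
-- "ordinal" column giving each element's 1-based position.
SELECT ID, [1] AS Street, [2] AS City, [3] AS State, [4] AS Code
FROM (
    SELECT t.ID, s.value, s.ordinal
    FROM #Table AS t
    CROSS APPLY STRING_SPLIT(REPLACE(t.StreetAddress, ', ', ','), ',', 1) AS s
) src
PIVOT (
    MAX(value) FOR ordinal IN ([1], [2], [3], [4])
) pvt;
```

I have not timed this variant, so the 12-second figure above does not apply to it.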
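You may be able to improve performance by removing the REPLACE inside STRING_SPLIT, but you would then need to trim the resulting columns to strip the leading spaces.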
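As a rough sketch of that variant (untested for speed, so treat any performance gain as something to measure), the REPLACE is dropped and LTRIM is applied to the pivoted columns instead. It uses the same #Table and the same ROW_NUMBER() numbering as the query above:

```sql
-- Split on ',' only; every element after the first keeps a leading
-- space, which LTRIM removes once the values are pivoted into columns.
SELECT ID,
       LTRIM([1]) AS Street,
       LTRIM([2]) AS City,
       LTRIM([3]) AS State,
       LTRIM([4]) AS Code
FROM (
    SELECT ID, value,
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID) AS Rn
    FROM #Table
    CROSS APPLY STRING_SPLIT(StreetAddress, ',')
) src
PIVOT (
    MAX(value) FOR Rn IN ([1], [2], [3], [4])
) pvt;
```

LTRIM is used rather than TRIM because TRIM only arrived in SQL Server 2017, while this answer targets 2016.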