Sql-Server

SQL PATINDEX/排序規則問題/錯誤?

  • June 27, 2022

我有一個函式(幾年前我在這裡找到的),它使用 STUFF/PATINDEX 從字元串中去除非數字字元。在不區分大小寫的排序規則上執行時,它工作正常。最近我需要在區分大小寫的排序數據庫上使用它,並發現了一些奇怪的行為。如果 PATINDEX 的 %pattern% 只是指定小寫(例如-%

$$ ^a-z0-9_- $$%),則當排序規則為 Latin1_General_100_CS_AS 時,將刪除所有大寫 Z。如果排序規則是 SQL_Latin1_General_CP1_CS_AS,則刪除大寫字母 A。這是一個錯誤還是我錯過了什麼?

USE TestCollation
GO
PRINT '---------SourceString---------'
PRINT 'ABCDEFGHIJKLMNO_Z_A_PQRSTUVWXYZ-abcdefghijklmnopqrstuvwxyz'
GO
ALTER DATABASE TestCollation COLLATE SQL_Latin1_General_CP1_CI_AS;
GO
PRINT '---------SQL_Latin1_General_CP1_CI_AS---------'
GO

DECLARE @ExternalId VARCHAR(255) = 'ABCDEFGHIJKLMNO_Z_A_PQRSTUVWXYZ-abcdefghijklmnopqrstuvwxyz'
DECLARE @return VARCHAR(255)
SET @return = @ExternalId
DECLARE @KeepValues AS VARCHAR(50)
SET @KeepValues = '%[^a-z0-9_-]%'
WHILE PATINDEX ( @KeepValues, @return ) > 0
BEGIN
   SET @return = STUFF ( @return, PATINDEX ( @KeepValues, @return ), 1, '' )
END
PRINT @return

go

ALTER DATABASE TestCollation COLLATE Latin1_General_100_CS_AS;
GO
PRINT '---------Latin1_General_100_CS_AS---------'
GO
DECLARE @ExternalId VARCHAR(255) = 'ABCDEFGHIJKLMNO_Z_A_PQRSTUVWXYZ-abcdefghijklmnopqrstuvwxyz'
DECLARE @return VARCHAR(255)
SET @return = @ExternalId
DECLARE @KeepValues AS VARCHAR(50)
SET @KeepValues = '%[^a-z0-9_-]%'
WHILE PATINDEX ( @KeepValues, @return ) > 0
BEGIN
   SET @return = STUFF ( @return, PATINDEX ( @KeepValues, @return ), 1, '' )
END
PRINT @return

ALTER DATABASE TestCollation COLLATE SQL_Latin1_General_CP1_CS_AS;
GO
PRINT '---------SQL_Latin1_General_CP1_CS_AS---------'
GO
DECLARE @ExternalId VARCHAR(255) = 'ABCDEFGHIJKLMNO_Z_A_PQRSTUVWXYZ-abcdefghijklmnopqrstuvwxyz'
DECLARE @return VARCHAR(255)
SET @return = @ExternalId
DECLARE @KeepValues AS VARCHAR(50)
SET @KeepValues = '%[^a-z0-9_-]%'
WHILE PATINDEX ( @KeepValues, @return ) > 0
BEGIN
   SET @return = STUFF ( @return, PATINDEX ( @KeepValues, @return ), 1, '' )
END
PRINT @return

這不是錯誤。在進行區分大小寫的排序時,這只是大小寫優先的區別。雖然看起來似乎沒有進行任何排序,但兩者中使用的字元[...]範圍萬用字元,在某種意義上,在應用範圍時對字元進行排序,例如任何模式(在本例中為and )。所以,這兩個選項是:LIKE``PATINDEX``{character}-{character}``a-z``0-9

  1. A a BbCc…Z z ( az不包括A )
  2. a AbBcC… z Z ( az不包括Z )

SQL Server 排序規則(即名稱以 開頭的排序規則SQL_)大多使用一種方法,而 Windows 排序規則(即名稱以 開頭的排序規則SQL_)使用另一種方​​法。

為了說明行為:

SELECT * FROM (VALUES ('A'), ('a'), ('Z'), ('z')) tmp (val)
WHERE tmp.val LIKE '%[a-z]%' COLLATE SQL_Latin1_General_CP1_CS_AS
ORDER BY tmp.val COLLATE SQL_Latin1_General_CP1_CS_AS;
/*
a
Z
z
*/

SELECT * FROM (VALUES ('A'), ('a'), ('Z'), ('z')) tmp (val)
WHERE tmp.val LIKE '%[a-z]%' COLLATE Latin1_General_100_CS_AS
ORDER BY tmp.val COLLATE Latin1_General_100_CS_AS;
/*
a
A
z
*/

SELECT * FROM (VALUES ('A'), ('a'), ('Z'), ('z')) tmp (val)
WHERE tmp.val LIKE '%[a-z]%' COLLATE Latin1_General_100_BIN2
ORDER BY tmp.val COLLATE Latin1_General_100_BIN2;
/*
a
z
*/

僅供參考:您可以通過在兩個呼叫中強制排序規則來避免處理數據庫的預設排序規則PATINDEX

GO
DECLARE @ExternalId VARCHAR(255) =
   'ABCDEFGHIJKLMNO_Z_A_PQRSTUVWXYZ-abcdefghijklmnopqrstuvwxyz'
DECLARE @return VARCHAR(255)
SET @return = @ExternalId
DECLARE @KeepValues AS VARCHAR(50)
SET @KeepValues = '%[^a-z0-9_-]%'
WHILE PATINDEX ( @KeepValues COLLATE Latin1_General_100_CI_AS, @return ) > 0
BEGIN
   SET @return = STUFF ( @return,
                         PATINDEX ( @KeepValues COLLATE Latin1_General_100_CI_AS,
                                    @return ),
                         1,
                         '' )
END
PRINT @return
GO

或者,添加A-Z到模式中:

SET @KeepValues = '%[^a-zA-Z0-9_-]%'

引用自:https://dba.stackexchange.com/questions/313827