Why does Postgres ORDER BY seem to halfway ignore leading underscores?

July 6, 2019

I have an animal table with a name varchar(255) column, and I have added rows with the following values:
Piranha
__Starts With 2
Rhino
Starts With 1
0_Zebra
_Starts With 1
Antelope
_Starts With 1
When I run this query:
zoology=# SELECT name FROM animal ORDER BY name;
     name       
-----------------
0_Zebra
Antelope
Piranha
Rhino
_Starts With 1
_Starts With 1
Starts With 1
__Starts With 2
(8 rows)
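Notice that the rows are sorted in an order implying that the leading _ is used to place the _Starts With 1 rows before the Starts With 1 row, but the __ in __Starts With 2 seems to ignore this fact, as if the 2 at the end were more important than the first two characters.
Why is this?
If I sort with Python, the result is: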
In  [2]: for animal in sorted(animals):
  ....:     print animal
  ....:     
0_Zebra
Antelope
Piranha
Rhino
Starts With 1
_Starts With 1
_Starts With 1
__Starts With 2
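Further, the Python sort suggests that the underscore comes after letters, which indicates that Postgres' sorting of the two _Starts rows before the Starts row is incorrect.
Note: I am using Postgres 9.1.15
Here is my attempt at finding the collation: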
zoology=# select datname, datcollate from pg_database;
 datname  | datcollate  
-----------+-------------
template0 | en_US.UTF-8
postgres  | en_US.UTF-8
template1 | en_US.UTF-8
zoology   | en_US.UTF-8
(4 rows)
And:
zoology=# select table_schema, 
   table_name, 
   column_name,
   collation_name
from information_schema.columns
where collation_name is not null
order by table_schema,
   table_name,
   ordinal_position;
table_schema | table_name | column_name | collation_name 
--------------+------------+-------------+----------------
(0 rows)

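Since you did not define a different collation for the column in question, it uses the database-wide collation, en_US.UTF-8, just like on my test box. I observe exactly the same behaviour, take that as a consolation :)
What we are seeing is apparently a case of variable collation elements. Depending on the character and the collation, a number of different behaviours are possible. Here the underscore (and the hyphen and some others) is used only for breaking ties: 'a' and '_a' are equivalent in the first round, and then the tie between them is resolved by taking the underscore into account.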
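To make the first round concrete, here is a rough Python 3 sketch applied to the values from the question. The primary_key helper is hypothetical and only approximates the idea (drop everything that is not a letter or digit, ignore case); it is not the actual algorithm glibc uses for en_US.UTF-8.

```python
# Rough model of the first comparison pass of a multi-level collation.
# Assumption: punctuation and spaces are ignored and case is folded.
# This is NOT the real en_US.UTF-8 collation algorithm.
names = [
    "Piranha", "__Starts With 2", "Rhino", "Starts With 1",
    "0_Zebra", "_Starts With 1", "Antelope", "_Starts With 1",
]


def primary_key(s):
    """First-pass key: keep only letters and digits, case-folded."""
    return "".join(ch for ch in s if ch.isalnum()).casefold()


# First pass only. sorted() is stable, so rows with equal keys
# simply keep their input order here.
by_primary = sorted(names, key=primary_key)

for name in by_primary:
    print(f"{primary_key(name):<12} {name}")
```

All three Starts With 1 variants share the key startswith1, while __Starts With 2 gets startswith2, so it lands after them no matter how many underscores it starts with. The sketch leaves the tie among the three unresolved; in the question's output Postgres breaks it using the underscore, which puts the _Starts rows first.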
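If you want to sort ignoring the underscores (and, in my example, hyphens, question marks and exclamation marks), you can define the ordering on an expression: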
SELECT * 
FROM (VALUES ('a'), 
            ('b1'), 
            ('_a'), 
            ('-a'), 
            ('?a'), 
            ('!a1'), 
            ('a2')
    ) t (val) 
ORDER BY translate(val, '_-?!', '');
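In my experiments, adding new values to the list frequently changed the order between otherwise equal items, showing that they are treated as truly equal.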
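If the sorting happens in application code instead, the same idea can be sketched in Python 3 with str.translate. The names IGNORED and sort_key are made up for this illustration, and the remaining characters are compared by code point rather than by the database collation.

```python
# Characters to ignore while sorting, matching the translate() call above.
IGNORED = str.maketrans("", "", "_-?!")


def sort_key(val):
    """Return val with the ignored characters removed."""
    return val.translate(IGNORED)


vals = ["a", "b1", "_a", "-a", "?a", "!a1", "a2"]
ordered = sorted(vals, key=sort_key)
print(ordered)  # ['a', '_a', '-a', '?a', '!a1', 'a2', 'b1']
```

Unlike the SQL query, Python's sorted() is stable, so 'a', '_a', '-a' and '?a' always stay in their input order relative to each other.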

Source: https://dba.stackexchange.com/questions/115364
