Oracle
刪除重複項時出現故障 oracle11g regexp_replace
我有一個逗號分隔的值列表(作為一個字元串),其中包含重複項,我想刪除它們。我嘗試過,
REGEXP_REPLACE
但對於某些值,它給了我一個錯誤的結果:/* CORRECT */ select '308RONBADB046782001,308RONBADB046782001,308RONCRT0046782001,308RONCRT0046782001,308RONMFRT046782001,308RONMFRT046782001' as str, regexp_replace('308RONBADB046782001,308RONBADB046782001,308RONCRT0046782001,308RONCRT0046782001,308RONMFRT046782001,308RONMFRT046782001', '([^,]+)(,*\1)+', '\1') as str_uq from dual union all /* FAULTY */ select '106RONBADB0W2562701,106RONCRT00W2562701,106RPB0130350001,106RONBADB0W2562701' as str, regexp_replace('106RONBADB0W2562701,106RONCRT00W2562701,106RPB0130350001,106RONBADB0W2562701', '([^,]+)(,*\1)+', '\1') as str_uq from dual
執行上面的範例,將為第一個查詢提供預期的結果,但對於第二個查詢,它只是連接所有值而不刪除重複項。我猜這些值開始的事實
1
與它有關。你能幫我解決正則表達式模式嗎?我找不到錯誤…
謝謝!
在您的第二個範例中,值是唯一的,但是您正在搜尋任何重複的匹配項並且有多個匹配項:
‘106RONBADB0W256270 1,1 06RONCRT 00 W256270 1,1 06RPB013035 0001,1 06RONBADB0W2562701’
它沒有連接,它正在刪除一些字元:-)
通常添加單詞邊界
\b
應該可以,但在 Oracle 中不行。我發現這是可行
'(.+?,)\1+'
的,但是您需要在字元串中添加最後一個逗號(小提琴)/* CORRECT */ select '308RONBADB046782001,308RONBADB046782001,308RONCRT0046782001,308RONCRT0046782001,308RONMFRT046782001,308RONMFRT046782001' as str, regexp_replace('308RONBADB046782001,308RONBADB046782001,308RONCRT0046782001,308RONCRT0046782001,308RONCRT0046782001,308RONMFRT046782001,308RONMFRT046782001' || ',', --'(\b[^,]+)(,*\1)+', --should work, doesn't in Oracle '(.+?,)\1+', '\1') as str_uq from dual union all /* FAULTY */ select '106RONBADB0W2562701,106RONCRT00W2562701,106RPB0130350001,106RONBADB0W2562701' as str, regexp_replace('106RONBADB0W2562701,106RONCRT00W2562701,106RPB0130350001,106RONBADB0W2562701' || ',', --'(\b[^,]+)(,*\1)+', --should work, doesn't in Oracle '(.+?,)\1+', '\1') as str_uq from dual
事實上,正如 dnoeth 所說,我似乎無法僅使用 REGEXP 來實現預期的行為,所以我建構了一個可以做到這一點的函式。我會把它貼在這裡,也許它對其他人也有用。
create or replace function str_remove_duplicates (str in varchar2, separator in varchar2) return varchar2 is regex_pattern varchar2(5); result varchar2(255); begin regex_pattern := '[^' || nvl(separator, ',') || ']+'; -- '[^,]+' select listagg(acc, separator) within group (order by acc) into result from ( select distinct acc from ( select regexp_substr(str, regex_pattern, 1, level) as acc from dual connect by regexp_substr(str, regex_pattern, 1, level) is not null ) ); return result; end str_remove_duplicates;
一切順利!
丹尼爾