PostgreSQL
How to use a function to avoid duplicates
I use this function in a bulk insert to avoid duplicate URL paths.
    CREATE OR REPLACE FUNCTION "univ"."gc_landing"(IN _name text, OUT landing_id int4)
      RETURNS "int4" AS
    $BODY$
    DECLARE
       landing_name TEXT;
    BEGIN
       landing_name := _name;
       LOOP
          BEGIN
             WITH sel AS (
                SELECT id FROM univ.landings WHERE name = landing_name
                )
             , ins AS (
                INSERT INTO univ.landings (name)
                SELECT landing_name
                WHERE  NOT EXISTS (SELECT 1 FROM sel)
                RETURNING id
                )
             SELECT id FROM sel NATURAL FULL OUTER JOIN ins INTO landing_id;
          EXCEPTION WHEN UNIQUE_VIOLATION THEN
             RAISE NOTICE 'It actually happened!';
          END;
          EXIT WHEN landing_id IS NOT NULL;
       END LOOP;
    END
    $BODY$ LANGUAGE plpgsql;
The paths look like this:
%2faccessories%2fliners.html%26sa%3du%26ei%3df1n2voo3fjk1acowgnao%26ved%3d0cesq9qewbg%26usg%3dafqjcnhvnzccoijbs0zvswwjxexhyd-7xw
My table:
     id         | integer                     | not null default nextval('univ.seq_landings_id'::regclass) | plain
     name       | text                        | not null                                                   | extended
     created_at | timestamp without time zone | not null default now()                                     | plain
    Indexes:
        "landings_pkey" PRIMARY KEY, btree (id)
        "uniq_index_landings_on_name" UNIQUE, btree (name)
I have a unique index on name, but the function still produces duplicates. How can I make sure there are no duplicates?
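I have retested your function, and it seems to work as advertised.
If a duplicate value were actually inserted, you would get an exception that rolls back the transaction.
The function is designed to trap exactly this exception (UNIQUE_VIOLATION) and retry, at which point it finds the (now) existing row and returns the existing id. The "duplicates" you report seem impossible. There must be a misunderstanding somewhere.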
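One way to check this on your side, as a rough sketch: call the function twice with the same name, then look for names stored more than once. The sample path below is a shortened, made-up value; only univ.gc_landing and univ.landings come from the question.

```sql
-- First call: no matching row yet, so the function inserts one and returns the new id.
SELECT univ.gc_landing('%2faccessories%2fliners.html') AS first_call;

-- Second call: the row exists now, so the function returns the same id without inserting.
SELECT univ.gc_landing('%2faccessories%2fliners.html') AS second_call;

-- List every name that is stored more than once.
SELECT name, count(*) AS copies
FROM   univ.landings
GROUP  BY name
HAVING count(*) > 1;
```

With a valid unique index on name, the last query should return no rows. If two rows merely look alike, compare them closely: names that differ in case, whitespace, or escaping are distinct values as far as the index is concerned.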
I only have minor improvements to offer:
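Make it a UNIQUE constraint, which is cleaner than a bare unique index and does the same job in this case. Since you already have the unique index, you can use the special form that adopts it. Pass the plain index name; the index is renamed to match the constraint name:

    ALTER TABLE univ.landings
      ADD CONSTRAINT landings_name_key UNIQUE USING INDEX uniq_index_landings_on_name;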
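To confirm that the constraint is in place afterwards, a catalog query along these lines should do. It is only a sketch: it lists all constraints on the table and relies on the standard pg_constraint catalog, nothing specific to your setup.

```sql
-- Show the constraints defined on univ.landings.
SELECT conname,
       contype,                              -- 'u' = unique, 'p' = primary key
       pg_get_constraintdef(oid) AS definition
FROM   pg_constraint
WHERE  conrelid = 'univ.landings'::regclass;
```

After the ALTER TABLE above, landings_name_key should appear with contype 'u'.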
You can simplify the function a bit. Your additional variable landing_name is just dead weight:

    CREATE OR REPLACE FUNCTION univ.gc_landing(_name text, OUT landing_id int)
      RETURNS int AS
    $func$
    BEGIN
       LOOP
          BEGIN
             -- Look up the name first; insert only if it is not there yet.
             WITH sel AS (
                SELECT id FROM univ.landings WHERE name = _name
                )
             , ins AS (
                INSERT INTO univ.landings (name)
                SELECT _name
                WHERE  NOT EXISTS (SELECT 1 FROM sel)
                RETURNING id
                )
             SELECT id FROM sel NATURAL FULL OUTER JOIN ins INTO landing_id;
          EXCEPTION WHEN UNIQUE_VIOLATION THEN
             -- A concurrent transaction inserted the same name; loop and select it.
             RAISE NOTICE 'It actually happened!';
          END;
          EXIT WHEN landing_id IS NOT NULL;
       END LOOP;
    END
    $func$ LANGUAGE plpgsql;
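Note, however, that this function is designed for single inserts. For the bulk operation you mention, a different approach is likely to be more efficient: fold duplicates in the source before bulk-inserting into the target table.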
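Folding duplicates in the source could look roughly like this. It is a sketch under assumptions: univ.landings_import(name text) is a hypothetical staging table holding the raw batch, so substitute whatever your real source is.

```sql
-- Insert each distinct new name once, skipping names already in the target.
INSERT INTO univ.landings (name)
SELECT DISTINCT i.name
FROM   univ.landings_import i                -- hypothetical staging table
WHERE  i.name IS NOT NULL                    -- name is NOT NULL in the target
AND    NOT EXISTS (
   SELECT 1 FROM univ.landings l WHERE l.name = i.name
   );
```

DISTINCT removes duplicates within the batch, and NOT EXISTS skips names that are already stored. This is not safe against concurrent writers inserting the same name at the same time; a unique violation would abort the whole statement. On PostgreSQL 9.5 or later, appending ON CONFLICT (name) DO NOTHING covers that case.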