Join
跨一對多關係表查找重複記錄
假設我有兩個具有一對多關係的表:人員和屬性。我正在嘗試根據 first_name、last_name 查找重複項,並且所有屬性必須完全匹配。
CREATE TABLE People (Id int, first_name varchar(100), last_name varchar(100)); CREATE TABLE Attributes (Id int, person_id int, field varchar(100), field_value varchar(100)); INSERT INTO People VALUES (1, 'John', 'Smith'); INSERT INTO People VALUES (2, 'John', 'Smith'); INSERT INTO People VALUES (3, 'John', 'Smith'); INSERT INTO Attributes VALUES (1, 1, 'HairColor', 'Brown'); INSERT INTO Attributes VALUES (2, 1, 'EyeColor', 'Blue'); INSERT INTO Attributes VALUES (3, 2, 'HairColor', 'Brown'); INSERT INTO Attributes VALUES (4, 2, 'EyeColor', 'Blue'); INSERT INTO Attributes VALUES (5, 3, 'HairColor', 'Blonde');
這給了我們:
id | first_name | last_name ----+------------+----------- 1 | John | Smith 2 | John | Smith 3 | John | Smith id | person_id | field | field_value ----+-----------+-----------+------------- 1 | 1 | HairColor | Brown 2 | 1 | EyeColor | Blue 3 | 2 | HairColor | Brown 4 | 2 | EyeColor | Blue 5 | 3 | HairColor | Blonde
我想要一個從 People 表中返回 ID 1 和 2 的查詢。我可以在一個表中找到重複項:
select first_name,last_name,count(*) from People group by first_name,last_name having ( count(*) > 1 );
但是我無法加入一對多表並檢測兩個表中的重複項。如何檢測具有一對多關係的表中的重複項?
一種方法(檢查SQLfiddle):
select p1.id as id1, p2.id as id2 from people p1 join people p2 on p1.first_name = p2.first_name and p1.last_name = p2.last_name and p1.id < p2.id where not exists ( select 1 from ( select * from attributes a1 where a1.person_id = p1.id union all select * from attributes a2 where a2.person_id = p2.id ) g group by field, field_value having count(*) <> 2 ) ;
另一個:
select p1.id as id1, p2.id as id2 from people p1 join people p2 on p1.first_name = p2.first_name and p1.last_name = p2.last_name and p1.id < p2.id where not exists ( ( select field, field_value from attributes a1 where a1.person_id = p1.id union select field, field_value from attributes a2 where a2.person_id = p2.id ) except ( select field, field_value from attributes a1 where a1.person_id = p1.id intersect select field, field_value from attributes a2 where a2.person_id = p2.id ) ) ;
Intersect 的優先級高於 except/minus,至少在 Postgres 和 SQL Server 中是這樣。為了安全起見,您可以使用括號來確保優先級。