MySQL

Find groups with duplicate group versions

  • December 28, 2020

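I want to find all groups in a table that have duplicate group versions. A group can have multiple group versions, and each group version can have multiple members. A group "version" is defined by grpid and changeDate. A group is considered a duplicate if all members (userid, pct and hobby) in one group version match another group version within the same group.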

https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=84eb81a1a71dcee9ad3d0bd91f56120a

groups

*The unique group version number is shown for visualization only.

The result should be:
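
To make the definition of a group version concrete, here is a minimal sketch that lists each (grpid, changeDate) pair with its member count. It assumes the table is named groups and has the columns used in the queries further down; the alias member_count is made up for illustration.

```sql
-- Lists every group version with its member count.
-- Assumes the question's table is named groups with columns grpid and changeDate.
-- Backticks keep the name valid on MySQL 8.0, where GROUPS is a reserved word.
SELECT grpid, changeDate, COUNT(*) AS member_count
FROM `groups`
GROUP BY grpid, changeDate
ORDER BY grpid, changeDate;
```

Two versions of the same grpid can only be duplicates of each other if their member counts match, so this is a quick sanity check before running the heavier queries.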

grpid
1
4
7

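Explanation:

  • grpid 1 - has 3 group versions (1 member each) - versions 1 and 3 are duplicates because userid, pct and hobby are equal
  • grpid 2 - has 2 group versions (2 members each) - not a duplicate, because hobby differs between 5 and 7
  • grpid 3 - has 2 group versions (2 members each) - not a duplicate, because pct differs for all members
  • grpid 4 - has 2 group versions (2 members each) - all members are duplicates because userid, pct and hobby are equal
  • grpid 5 - has only one group version with 3 members - not a duplicate
  • grpid 6 - has 3 group versions (3 members each) - not a duplicate - pct changes for every member of the group between versions
  • grpid 7 - has 3 group versions (3 members each) - duplicate, because userid, pct and hobby are equal between 28-30 and 31-33
  • grpid 8 - has 2 group versions, one with one member and one with 2 members - not a duplicate, because one version has an additional member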

I am using MySQL 5.7.

Hopefully this provides an answer:

This implements "general" relational division:

SELECT DISTINCT grpid
FROM groups AS g
GROUP BY grpid, changeDate
HAVING NOT EXISTS
   ( SELECT 1
     FROM groups AS gi
     WHERE gi.grpid = g.grpid
       AND gi.changeDate = g.changeDate
       AND NOT EXISTS
           ( SELECT 1 
              FROM groups AS gk
              WHERE gk.grpid = gi.grpid
                AND gk.changeDate <> gi.changeDate
                AND gk.userid = gi.userid
                AND gk.pct = gi.pct
                AND gk.hobby = gi.hobby
           )
   ) ;

Result:

grpid
  1
  4
  7

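This implements exact relational division, which leads to more complex code. In your case the grouping/version is determined by two columns, (grpid, changeDate), which makes it look even more complicated.

Tested in **dbfiddle.uk**

Query 1: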

SELECT DISTINCT g1.grpid
FROM 
 ( SELECT grpid, changeDate
   FROM groups AS g
   GROUP BY grpid, changeDate
 ) AS g1
 JOIN
 ( SELECT grpid, changeDate
   FROM groups AS g
   GROUP BY grpid, changeDate
 ) AS g2
 ON  g1.grpid = g2.grpid
 AND g1.changeDate < g2.changeDate
WHERE NOT EXISTS
   ( SELECT 1
     FROM groups AS gi
     WHERE gi.grpid = g1.grpid
       AND gi.changeDate = g1.changeDate
       AND NOT EXISTS
           ( SELECT 1 
              FROM groups AS gk
              WHERE gk.grpid = g2.grpid
                AND gk.changeDate = g2.changeDate
                AND gk.userid = gi.userid
                AND gk.pct = gi.pct
                AND gk.hobby = gi.hobby
           )
   )
  AND NOT EXISTS
   ( SELECT 1
     FROM groups AS gi
     WHERE gi.grpid = g2.grpid
       AND gi.changeDate = g2.changeDate
       AND NOT EXISTS
           ( SELECT 1 
              FROM groups AS gk
              WHERE gk.grpid = g1.grpid
                AND gk.changeDate = g1.changeDate
                AND gk.userid = gi.userid
                AND gk.pct = gi.pct
                AND gk.hobby = gi.hobby
           )
   )
;
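
Why Query 1 needs both NOT EXISTS blocks can be seen in a small sketch. Everything in it is hypothetical: the table name groups_demo, the column types, the rows, and the aliases subset_version and superset_version are assumptions for illustration, not data from the question. The query keeps only the first of the two checks, so it reports a version whose members all appear in another version even when that other version has extra members.

```sql
-- Hypothetical demo table: name, column types and rows are assumptions for illustration.
CREATE TABLE groups_demo (
  grpid      INT         NOT NULL,
  changeDate DATE        NOT NULL,
  userid     INT         NOT NULL,
  pct        INT         NOT NULL,
  hobby      VARCHAR(50) NOT NULL
);

-- Version 2020-01-01 has one member; version 2020-02-01 repeats that member and adds another.
INSERT INTO groups_demo (grpid, changeDate, userid, pct, hobby) VALUES
  (101, '2020-01-01', 1, 50, 'chess'),
  (101, '2020-02-01', 1, 50, 'chess'),
  (101, '2020-02-01', 2, 50, 'golf');

-- One-directional check only: every member of version g1 must appear in version g2.
SELECT g1.grpid,
       g1.changeDate AS subset_version,
       g2.changeDate AS superset_version
FROM ( SELECT DISTINCT grpid, changeDate FROM groups_demo ) AS g1
JOIN ( SELECT DISTINCT grpid, changeDate FROM groups_demo ) AS g2
  ON  g1.grpid = g2.grpid
  AND g1.changeDate <> g2.changeDate
WHERE NOT EXISTS
   ( SELECT 1
     FROM groups_demo AS gi
     WHERE gi.grpid = g1.grpid
       AND gi.changeDate = g1.changeDate
       AND NOT EXISTS
           ( SELECT 1
             FROM groups_demo AS gk
             WHERE gk.grpid = g2.grpid
               AND gk.changeDate = g2.changeDate
               AND gk.userid = gi.userid
               AND gk.pct = gi.pct
               AND gk.hobby = gi.hobby
           )
   ) ;
```

On these rows the query should return one row pairing 2020-01-01 with 2020-02-01, although the two versions are not identical. Adding the mirrored NOT EXISTS, as Query 1 does, should remove that row, because user 2 has no counterpart in the 2020-01-01 version.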

Query 2, using the MySQL GROUP_CONCAT function:

SELECT DISTINCT grpid
FROM
 ( SELECT grpid, changeDate, 
          GROUP_CONCAT( CONCAT_WS('-', userid, hobby, pct)
                        ORDER BY userid, hobby, pct
                        SEPARATOR '  '
                      ) AS groupdata 
   FROM groups AS gr
   GROUP BY grpid, changeDate
 ) AS g
GROUP BY grpid, groupdata
HAVING COUNT(*) > 1 ;
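
A few caveats may apply to the GROUP_CONCAT approach. GROUP_CONCAT output is truncated at group_concat_max_len (1024 bytes by default), so large versions could compare equal by accident. CONCAT_WS skips NULL arguments, and a hobby value that contains the separator characters could in principle produce colliding strings. The sketch below is one possible variant under those assumptions: it raises the session limit and also lists which change dates share the same member set. The limit value and the alias duplicate_versions are illustrative choices, not part of the original answer.

```sql
-- Raise the per-session limit so long member lists are not silently truncated.
-- The value 1000000 is an arbitrary illustrative choice.
SET SESSION group_concat_max_len = 1000000;

-- Same grouping as Query 2, but also shows which versions are duplicates of each other.
SELECT grpid,
       GROUP_CONCAT( changeDate
                     ORDER BY changeDate
                     SEPARATOR ', '
                   ) AS duplicate_versions
FROM
 ( SELECT grpid, changeDate,
          GROUP_CONCAT( CONCAT_WS('-', userid, hobby, pct)
                        ORDER BY userid, hobby, pct
                        SEPARATOR '  '
                      ) AS groupdata
   FROM `groups` AS gr   -- backticks: GROUPS is a reserved word in MySQL 8.0
   GROUP BY grpid, changeDate
 ) AS g
GROUP BY grpid, groupdata
HAVING COUNT(*) > 1 ;
```

If any of userid, hobby or pct can be NULL, wrapping each in COALESCE with a sentinel value before concatenation would keep the field positions aligned.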

Source: https://dba.stackexchange.com/questions/281995