MariaDB

Get all rows where a specific column value occurs more than once (filter out single occurrences)

  • May 17, 2021

I'm trying to filter out of an existing result set all sessionIds that occur only once.

This query is used in a web application and runs against a large dataset (about 35 million rows), so I'd like to avoid a subquery here.

I've tried the following. It delivers a filtered result, except that I now get only one row per sessionId (I want every request and response):

CREATE TABLE `api_log` (
 `id` varchar(50) NOT NULL,
 `clientId` varchar(100) DEFAULT NULL,
 `inserted` int(11) DEFAULT NULL,
 `sessionId` mediumtext DEFAULT NULL,
 `stage` varchar(120) DEFAULT NULL,
 `request` longtext CHARACTER SET utf8mb4 DEFAULT NULL,
 `response` longtext CHARACTER SET utf8mb4 DEFAULT NULL,
 PRIMARY KEY (`id`)
);

INSERT INTO api_log
 VALUES
   ("1", "abc", 1621008484, "session1", "production", '{"key":"value"}', '{"key":"value"}'),
   ("2", "abc", 1621008494, "session2", "production", '{"key":"value"}', '{"key":"value"}'),
   ("3", "abc", 1621008584, "session1", "production", '{"key":"value"}', '{"key":"value"}'),
   ("4", "abc", 1621008684, "session2", "production", '{"key":"value"}', '{"key":"value"}'),
   ("5", "abc", 1621008784, "session3", "production", '{"key":"value"}', '{"key":"value"}'),
   ("6", "abc", 1621008884, "session4", "production", '{"key":"value"}', '{"key":"value"}'),
   ("7", "abc", 1621008984, "session5", "production", '{"key":"value"}', '{"key":"value"}'),
   ("8", "abc", 1621009084, "session6", "production", '{"key":"value"}', '{"key":"value"}'),
   ("9", "abc", 1621009184, "session7", "production", '{"key":"value"}', '{"key":"value"}'),
   ("10", "abc", 1621009284, "session8", "production", '{"key":"value"}', '{"key":"value"}');

SELECT
   `clientId`,
   `sessionId`,
   `inserted`,
   `stage`,
   `request`,
   `response`
FROM
   `api_log`
WHERE
   (stage = 'production') 
   AND (clientId = 'abc') 
   AND (
       `inserted` BETWEEN 1621008482 AND 1621009285
   )
GROUP BY
   `clientId`,
   `stage`,
   `sessionId`
HAVING
   COUNT(sessionId) > 1

Is there any trick to fetch all rows where a sessionId occurs more than once?

In this case I get two rows, one for session1 and one for session2, but I'm missing another two rows, because both of those sessionIds have an additional row that should match.

SQL Fiddle: http://sqlfiddle.com/#!9/f87bb54/3

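If I understand your requirement correctly, this is the kind of problem where window aggregates shine. Apply the window version of the COUNT(*) function to the filtered dataset to get the count alongside the other columns. Then filter on that count to keep only the rows you want. Your output can include any or all of the columns in your table: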

SELECT
 id
, clientId
, inserted
, sessionId
, stage
, request
, response
FROM
 (
   SELECT
     *
   , COUNT(*) OVER (PARTITION BY sessionId) AS sessionIdCounter
   FROM
     api_log
   WHERE (stage = 'production') 
     AND (clientId = 'abc') 
     AND (inserted BETWEEN 1621008482 AND 1621009285)
 ) AS derived
WHERE
 sessionIdCounter > 1
ORDER BY
 inserted ASC
;
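
To see why filtering on the counter works, it can help to run the inner query on its own. The sketch below is only an illustration against the sample data from the question: it trims the column list to id and sessionId for readability, and the ORDER BY is added here purely to make the output easy to read.

```sql
-- Inner query only: each filtered row keeps its own columns and
-- gains the number of rows that share its sessionId.
SELECT
  id
, sessionId
, COUNT(*) OVER (PARTITION BY sessionId) AS sessionIdCounter
FROM
  api_log
WHERE (stage = 'production')
  AND (clientId = 'abc')
  AND (inserted BETWEEN 1621008482 AND 1621009285)
ORDER BY
  inserted ASC
;

-- Result with the ten sample rows:
-- id | sessionId | sessionIdCounter
-- 1  | session1  | 2
-- 2  | session2  | 2
-- 3  | session1  | 2
-- 4  | session2  | 2
-- 5  | session3  | 1
-- 6  | session4  | 1
-- 7  | session5  | 1
-- 8  | session6  | 1
-- 9  | session7  | 1
-- 10 | session8  | 1
```

Unlike GROUP BY, the window function does not collapse rows, so the outer WHERE sessionIdCounter > 1 keeps both rows of session1 and both rows of session2 (ids 1 to 4).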

You can try this solution on dbfiddle.uk.
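
Window functions require MariaDB 10.2 or later (MySQL 8.0 or later). On an older server, one possible fallback is to find the repeated sessionIds with GROUP BY ... HAVING in a derived table and join back to the table to fetch every matching row. This is a sketch, not part of the original answer; the aliases l and repeated are made up for illustration, and it has not been measured against the 35-million-row table.

```sql
-- Fallback sketch for servers without window functions:
-- step 1 (derived table): sessionIds that occur more than once in the filtered set
-- step 2 (join): fetch every filtered row that belongs to one of those sessionIds
SELECT
  l.id
, l.clientId
, l.inserted
, l.sessionId
, l.stage
, l.request
, l.response
FROM
  api_log AS l
  JOIN (
    SELECT
      sessionId
    FROM
      api_log
    WHERE (stage = 'production')
      AND (clientId = 'abc')
      AND (inserted BETWEEN 1621008482 AND 1621009285)
    GROUP BY
      sessionId
    HAVING
      COUNT(*) > 1
  ) AS repeated
    ON repeated.sessionId = l.sessionId
-- the same filter is repeated so rows outside the range are not pulled back in
WHERE (l.stage = 'production')
  AND (l.clientId = 'abc')
  AND (l.inserted BETWEEN 1621008482 AND 1621009285)
ORDER BY
  l.inserted ASC
;
```

With the sample data this returns the same four rows (ids 1 to 4) as the window-function query. It does use a subquery, which the question hoped to avoid, so it is worth comparing both forms with EXPLAIN on the real data.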

Source: https://dba.stackexchange.com/questions/291569