MySQL

MySQL correlated subquery per row (multiple columns)

  • February 1, 2022

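The MySQL documentation states:

Subqueries in the FROM clause cannot be correlated subqueries. They are materialized in whole (evaluated to produce a result set) during query execution, so they cannot be evaluated per row of the outer query.

OK, fair enough, I understand why that is. But what if you want to repeat a specific query for multiple values?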

SELECT COUNT( returning_visitors.per_ip ) AS count, AVG( returning_visitors.per_ip ) AS num_visits
FROM (
   SELECT COUNT( * ) AS per_ip
   FROM site_hits_unique
   WHERE site_hits_unique.site_id = ___INPUT___
   AND site_hits_unique.date >= CURDATE( ) - INTERVAL 30 DAY
   GROUP BY site_hits_unique.site_id, site_hits_unique.ip
   HAVING per_ip >1
) AS returning_visitors

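Suppose I want to use each id returned by SELECT id FROM sites as ___INPUT___. Is there a way to do this in a single query, without using a stored procedure? Something like: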

SELECT id, COUNT( returning_visitors.per_ip ) as readers, AVG( returning_visitors.per_ip ) as avg_visits_pr
FROM sites
SUBQUERY-PER-ROW (
   SELECT COUNT( * ) AS per_ip
   FROM site_hits_unique
   WHERE site_hits_unique.site_id = sites.id
   AND site_hits_unique.date >= CURDATE( ) - INTERVAL 30 DAY
   GROUP BY site_hits_unique.site_id, site_hits_unique.ip
   HAVING per_ip > 1
) AS returning_visitors
WHERE sites.id IN (162888, 42705, 11412)

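SUBQUERY-PER-ROW is pseudocode here. A JOIN will not work because of the restriction quoted above.

Edit:

Another attempt, again in pseudocode, using information from http://dev.mysql.com/doc/refman/5.7/en/row-subqueries.html: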

SELECT id, (
   SELECT COUNT( per_ip ) AS count, AVG( per_ip ) AS num_visits
   FROM (
       SELECT COUNT( * ) AS per_ip
       FROM site_hits_unique
       WHERE site_hits_unique.site_id = sites.id
       AND site_hits_unique.date >= CURDATE( ) - INTERVAL 30 DAY
       GROUP BY site_hits_unique.site_id, site_hits_unique.ip
       HAVING per_ip >1
   ) AS returning_visitors
) as (readers, avg_visits_pr)
FROM sites
WHERE sites.id IN (162888, 42705, 11412)

(It does not work. This is not valid MySQL syntax, just a pseudocode example.)

Is this what you want?

SELECT  s.id,
       COUNT( rv.per_ip ) as readers,
       AVG( rv.per_ip ) as avg_visits_pr
   FROM  sites AS s
   JOIN  
   (
       SELECT  u.site_id,
               COUNT( * ) AS per_ip
           FROM  site_hits_unique AS u
           WHERE  u.date >= CURDATE( ) - INTERVAL 30 DAY
           GROUP BY  u.site_id, u.ip
           HAVING  per_ip > 1 
   )  AS rv ON rv.site_id = s.id
   WHERE  s.id IN (162888, 42705, 11412)
   GROUP BY  s.id   -- one result row per site; required under ONLY_FULL_GROUP_BY

Index:

site_hits_unique: INDEX(site_id, date, ip)

Or perhaps something like this:

SELECT  site_id,
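       COUNT(*) AS readers,            -- rv has one row per returning (site_id, ip)
       AVG(per_ip) AS avg_visits_pr    -- mean hits per returning IP; COUNT(*) / COUNT(DISTINCT ip) would always be 1 here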
   FROM  
   (
       SELECT  site_id, ip, COUNT(*) AS per_ip
           FROM  site_hits_unique
           WHERE  date >= CURDATE( ) - INTERVAL 30 DAY
             AND  site_id IN (162888, 42705, 11412)
           GROUP BY  site_id, ip
           HAVING  per_ip > 1 
   ) AS rv
   GROUP BY  site_id;
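
On MySQL 8.0.14 or later, the SUBQUERY-PER-ROW pseudocode in the question maps fairly directly onto a LATERAL derived table, which may refer to columns of tables that precede it in the same FROM clause. The following is only a sketch under that version assumption, reusing the table and column names from the question; the aliases s, u and rv are illustrative.

```sql
-- Assumes MySQL 8.0.14+ (LATERAL is not available in 5.7).
SELECT  s.id,
        COUNT(rv.per_ip) AS readers,        -- number of returning IPs
        AVG(rv.per_ip)   AS avg_visits_pr   -- mean hits per returning IP
    FROM  sites AS s
    LEFT JOIN  LATERAL (
        -- Evaluated once per row of sites: one output row per returning IP
        SELECT  COUNT(*) AS per_ip
            FROM  site_hits_unique AS u
            WHERE  u.site_id = s.id
              AND  u.date >= CURDATE() - INTERVAL 30 DAY
            GROUP BY  u.ip
            HAVING  per_ip > 1
    ) AS rv ON TRUE
    WHERE  s.id IN (162888, 42705, 11412)
    GROUP BY  s.id;
```

With the LEFT JOIN, a site that has no returning visitors should still appear, with readers = 0 and avg_visits_pr = NULL; a plain JOIN would drop such sites.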

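So, I found a way to do what I wanted using stored procedures. I hope this is not the only solution and that there is a way to do it with a single query, but this is what I have.

A stored procedure that gets the returning readers for a single site_id: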

DELIMITER $$

CREATE PROCEDURE get_site_readers(
   IN input_site_id INT,
   OUT readers INT,
   OUT avg_visits_pr DOUBLE)
BEGIN

   SELECT COUNT( per_ip ), AVG( per_ip )
   INTO readers, avg_visits_pr
   FROM (
       SELECT COUNT( * ) AS per_ip
       FROM site_hits_unique
       WHERE site_id = input_site_id
       AND date >= CURDATE( ) - INTERVAL 30 DAY
       GROUP BY site_id, ip
       HAVING per_ip > 1
   ) AS returning_visitors;

END$$

DELIMITER ;

A stored procedure that runs get_site_readers(…) for every sites.id:

DELIMITER $$

CREATE PROCEDURE update_site_readers() BEGIN
   DECLARE done BOOLEAN DEFAULT FALSE;
   DECLARE _id BIGINT UNSIGNED;
   DECLARE cur CURSOR FOR SELECT id FROM sites;
   DECLARE CONTINUE HANDLER FOR NOT FOUND SET done := TRUE;

   OPEN cur;

   eachSite: LOOP
       FETCH cur INTO _id;
       IF done THEN
           LEAVE eachSite;
       END IF;
       CALL get_site_readers(_id, @readers, @avg_visits_pr);
       UPDATE sites SET readers = @readers, avg_visits_pr = @avg_visits_pr WHERE id=_id;
   END LOOP eachSite;

   CLOSE cur;
END$$

DELIMITER ;

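Note the query UPDATE sites SET readers = @readers, avg_visits_pr = @avg_visits_pr WHERE id=_id; this was my intent from the start: to update the sites table with the aggregated data. It does not, however, fully answer the question as posed, because I originally asked about doing this with a SELECT.

Finally, call the stored procedure: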

CALL update_site_readers();
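
If the end goal is only to refresh sites.readers and sites.avg_visits_pr, the cursor loop can probably be replaced by one multi-table UPDATE joined to an aggregated derived table. This is a sketch rather than a tested drop-in replacement; it assumes the same tables and columns as the procedures above, and the alias agg is illustrative.

```sql
-- Sketch: one statement instead of a CALL per site.
UPDATE  sites AS s
    LEFT JOIN (
        SELECT  site_id,
                COUNT(*)    AS readers,         -- returning IPs per site
                AVG(per_ip) AS avg_visits_pr    -- mean hits per returning IP
            FROM (
                SELECT  site_id, ip, COUNT(*) AS per_ip
                    FROM  site_hits_unique
                    WHERE  date >= CURDATE() - INTERVAL 30 DAY
                    GROUP BY  site_id, ip
                    HAVING  per_ip > 1
            ) AS returning_visitors
            GROUP BY  site_id
    ) AS agg ON agg.site_id = s.id
    -- Sites with no returning visitors get 0 and NULL, as the procedure would set
    SET  s.readers       = COALESCE(agg.readers, 0),
         s.avg_visits_pr = agg.avg_visits_pr;
```

Because the derived table is materialized once for all sites, this avoids both the correlated-subquery restriction and the per-row procedure call.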

Quoted from: https://dba.stackexchange.com/questions/148764