MySQL
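Matching a single column against multiple values in MySQL without self-joining the table
We have a table that stores answers to questions. We need to be able to find users who gave specific answers to specific questions. So, if our table contains the following data: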
    user_id  question_id  answer_value
    Sally    1            Pooch
    Sally    2            Peach
    John     1            Pooch
    John     2            Duke
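We want to find users who answered 'Pooch' to question 1 and 'Peach' to question 2. The following SQL will (obviously) not work, since no single row can have question_id equal to both 1 and 2: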
    select user_id
    from answers
    where question_id=1 and answer_value = 'Pooch'
      and question_id=2 and answer_value='Peach'
My first thought was to self-join the table once for each answer we are looking for:
    select a.user_id
    from answers a, answers b
    where a.user_id = b.user_id
      and a.question_id=1 and a.answer_value = 'Pooch'
      and b.question_id=2 and b.answer_value='Peach'
This works, but since we allow an arbitrary number of search filters, we need something more efficient. My next solution was this:
    select user_id, count(question_id)
    from answers
    where ( (question_id=2 and answer_value = 'Peach')
         or (question_id=1 and answer_value = 'Pooch') )
    group by user_id
    having count(question_id)>1
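However, we want users to be able to fill out the same questionnaire twice, so they could have two answers to question 1 in the answers table.
So now I am at a loss. What is the best way to approach this? Thanks!
I found a clever way to run this query without a self-join.
I ran these commands in MySQL 5.5.8 for Windows and got the following results: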
    use test
    DROP TABLE IF EXISTS answers;
    CREATE TABLE answers (user_id VARCHAR(10),question_id INT,answer_value VARCHAR(20));
    INSERT INTO answers VALUES
    ('Sally',1,'Pouch'),
    ('Sally',2,'Peach'),
    ('John',1,'Pooch'),
    ('John',2,'Duke');
    INSERT INTO answers VALUES
    ('Sally',1,'Pooch'),
    ('Sally',2,'Peach'),
    ('John',1,'Pooch'),
    ('John',2,'Duck');

    SELECT user_id,question_id,GROUP_CONCAT(DISTINCT answer_value) given_answers
    FROM answers
    GROUP BY user_id,question_id;

    +---------+-------------+---------------+
    | user_id | question_id | given_answers |
    +---------+-------------+---------------+
    | John    |           1 | Pooch         |
    | John    |           2 | Duke,Duck     |
    | Sally   |           1 | Pouch,Pooch   |
    | Sally   |           2 | Peach         |
    +---------+-------------+---------------+
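This output shows that John gave two different answers to question 2 and Sally gave two different answers to question 1.
To find the questions that each user answered differently, put the query above in a subquery and count the commas in the list of given answers to get the number of distinct answers, like this: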
    SELECT user_id,question_id,given_answers,
        (LENGTH(given_answers) - LENGTH(REPLACE(given_answers,',','')))+1 multianswer_count
    FROM
        (SELECT user_id,question_id,GROUP_CONCAT(DISTINCT answer_value) given_answers
         FROM answers
         GROUP BY user_id,question_id) A;
I got this:
    +---------+-------------+---------------+-------------------+
    | user_id | question_id | given_answers | multianswer_count |
    +---------+-------------+---------------+-------------------+
    | John    |           1 | Pooch         |                 1 |
    | John    |           2 | Duke,Duck     |                 2 |
    | Sally   |           1 | Pouch,Pooch   |                 2 |
    | Sally   |           2 | Peach         |                 1 |
    +---------+-------------+---------------+-------------------+
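The comma-counting expression can be checked on a single literal value. This is a minimal sketch using one of the concatenated strings from the output above. It assumes that no answer contains a comma itself, since such an answer would inflate the count.

```sql
-- 'Duke,Duck' has 9 characters; removing the comma leaves 8.
-- 9 - 8 = 1 comma, and 1 comma + 1 = 2 distinct answers.
SELECT (LENGTH('Duke,Duck') - LENGTH(REPLACE('Duke,Duck', ',', ''))) + 1
       AS multianswer_count;
-- returns 2
```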
Now just filter out the rows where multianswer_count = 1 using another subquery:
    SELECT * FROM
        (SELECT user_id,question_id,given_answers,
            (LENGTH(given_answers) - LENGTH(REPLACE(given_answers,',','')))+1 multianswer_count
         FROM
            (SELECT user_id,question_id,GROUP_CONCAT(DISTINCT answer_value) given_answers
             FROM answers
             GROUP BY user_id,question_id) A) AA
    WHERE multianswer_count > 1;
This is what I got:
    +---------+-------------+---------------+-------------------+
    | user_id | question_id | given_answers | multianswer_count |
    +---------+-------------+---------------+-------------------+
    | John    |           2 | Duke,Duck     |                 2 |
    | Sally   |           1 | Pouch,Pooch   |                 2 |
    +---------+-------------+---------------+-------------------+
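Essentially, I performed three table scans: one on the main table and two on the small subqueries. No joins!
Give it a try!!!
I prefer the join method myself: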
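As a possible simplification of the nested subqueries above (not part of the tested results in this answer), the distinct-answer count could be computed directly with COUNT(DISTINCT ...) and filtered with HAVING. This sketch assumes the same `answers` table and sample rows as above. Because the count no longer depends on parsing the concatenated string, it should be unaffected by answers that contain commas or by truncation of the GROUP_CONCAT result.

```sql
-- Count distinct answers per (user, question) and keep only the groups
-- that have more than one. given_answers is kept for display only.
SELECT user_id,
       question_id,
       GROUP_CONCAT(DISTINCT answer_value) AS given_answers,
       COUNT(DISTINCT answer_value)        AS multianswer_count
FROM answers
GROUP BY user_id, question_id
HAVING COUNT(DISTINCT answer_value) > 1;
```

On the sample data this should select the same two groups as the final query above: John's question 2 and Sally's question 1.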
    SELECT a.user_id
    FROM answers a
    INNER JOIN answers a1
        ON a1.user_id = a.user_id   -- tie each joined row to the same user
       AND a1.question_id=1 AND a1.answer_value='Pooch'
    INNER JOIN answers a2
        ON a2.user_id = a.user_id
       AND a2.question_id=2 AND a2.answer_value='Peach'
    GROUP BY a.user_id
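Update: After testing with a larger table (about 1 million rows), this method took much longer than the simple OR method mentioned in the original question.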
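The OR method from the question can be sketched so that repeated submissions do not distort the count. The idea is to count distinct matched questions instead of matched rows, and to compare that count with the number of filters. This is an illustration under assumptions, not a benchmarked solution: it assumes each filter targets a different question_id, and it treats a user as matching if any of their submissions contains the wanted answer, since the table has no column that separates one submission from another.

```sql
-- Each (question_id, answer_value) pair is one search filter.
-- COUNT(DISTINCT question_id) counts matched questions, not matched rows,
-- so a user who gave the same answer twice is still counted once per question.
SELECT user_id
FROM answers
WHERE (question_id = 1 AND answer_value = 'Pooch')
   OR (question_id = 2 AND answer_value = 'Peach')
GROUP BY user_id
HAVING COUNT(DISTINCT question_id) = 2;  -- 2 = number of filters
```

With the four rows from the question, only Sally matches both filters. Adding a filter means adding one more OR branch and raising the number in the HAVING clause by one.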