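Query suddenly stopped working - simple LEFT JOIN but I can't see what's wrong

I have a query that is a simple LEFT JOIN between two tables, with `IS NULL` included in the WHERE clause because I need to show all rows of the left table, even where the right table gives NULL values. This was working: I had it running in my PHP code and my website was displaying what it needed to. I hadn't looked at it for over a week, and when I went back to it today it had suddenly stopped working, even though I haven't touched it.

I have created a db fiddle with my exact code and tables here - https://dbfiddle.uk/?rdbms=mariadb_10.4&fiddle=2effc82390641ce513806252700fd25c

I want to show all rows from the left table (level_quiz) and the rows from the right table (student_points) where student_no = 40204123, with a NULL row where there is no match.

Can anyone take a look at this and see why it isn't showing the extra row from the left table (the one that would have NULL values for the right table)?

This would be much appreciated.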
You have to select the student before joining.

Also use aliases, so that you have less to type.

    SELECT *
    FROM level_quiz
    LEFT JOIN (SELECT * FROM student_points WHERE student_no = 40204123) sp
      ON level_quiz.id = sp.level_id

id | level_title | quiz_desc | overall_quest | student_no | level_id | points | timestamp
-: | :-- | :-- | :-- | -: | -: | -: | :--
1 | Sets | The purpose of this challenge is to help you become fully familiar with how to write particular types of sets and how to perform standard operations on actual sets. | Suppose we have three sets of numbers A, B and C, defined as follows: A = { 1, 10, 7, 3, 5, 2 }; B = { 5, 8, 6, 7, 4 }; C = { 7, 1, 8 } | 40204123 | 1 | 80 | 2021-01-12 15:37:11
2 | Sequences | sequences desc | overall question 2 | 40204123 | 2 | 75 | 2021-01-12 15:38:06
3 | Propositional Logic | logic desc | overall quest logic | 40204123 | 3 | 30 | 2021-01-13 22:13:13
4 | Predicate Logic - Sets | pred desc 1 | predicase quest 1 | *null* | *null* | *null* | *null*

db<>fiddle here
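Your code is doing exactly what you asked it to do!

<TL;DR>

There are three possible solutions to the problem outlined here:

- add extra records with `NULL` entries to "force" the table `JOIN`,
- rewrite the SQL (which makes it less performant - see the performance analysis section at the end),
- change the schema (normalise) - this is optimal IMHO - better performance and a better representation of reality.

</TL;DR>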
I tweaked your schema to make it more to my taste (the fiddle for the first part of this analysis is available here):

    CREATE TABLE level_quiz
    (
      id            INTEGER       NOT NULL,
      level_title   VARCHAR (50)  NOT NULL,
      quiz_desc     VARCHAR (200) NOT NULL,
      overall_quest VARCHAR (250) NOT NULL
    );

    CREATE TABLE student_points
    (
      student_no INTEGER   NOT NULL,
      level_id   INTEGER   NOT NULL,
      points     INTEGER   NULL,  -- << have to make NULLable, see below
      ts         TIMESTAMP NULL   -- << renamed timestamp to ts!
    );
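Two points to note:

- Unless you need `ZEROFILL` (see here and here), there is no point in declaring fields as `INT(x)` where x is a number - plus `LPAD` will do the same thing - plus it makes your code non-portable (see below).
- You should never use an `SQL keyword` (`TIMESTAMP` in this case) as a table or column name - it is bad for debugging, it produces confusing error messages and it is generally bad practice.

To make the results simpler, I truncated the fields as follows: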
    INSERT INTO level_quiz (id, level_title, quiz_desc, overall_quest) VALUES
    (1, 'Sets',       'The purpose of...', 'Suppose we have... '),  -- << truncated strings
    (2, 'Seqs',       'sequences desc...', 'overall question... '),
    (3, 'Prop Logic', 'logic desc ...',    'overall quest... '),
    (4, 'Pred Logic', 'pred desc 1 ...',   'predicase quest...');

There are also two extra records which I will `INSERT` later on in my analysis.

    INSERT INTO student_points (student_no, level_id, points, ts) VALUES
    (12345678, 1,  80, '2021-01-15 16:07:43'),
    (12345678, 2,  25, '2021-01-13 17:15:10'),
    (12345678, 3,  90, '2021-01-17 22:41:55'),
    (12345678, 4,  90, '2021-01-17 22:41:55'),
    (40204123, 1,  80, '2021-01-12 15:37:11'),
    (40204123, 2,  75, '2021-01-12 15:38:06'),
    (40204123, 3,  30, '2021-01-13 22:13:13'),
    -- (40204123, 4, NULL, NULL),  -- <<< see below what happens when this
    --                                    record is inserted
    (40213894, 1,  90, '2021-01-14 21:52:00'),
    (40213894, 2,  95, '2021-01-17 22:42:50'),
    (40213894, 4, 100, '2021-01-17 22:42:50');
    -- (40213894, 3, NULL, NULL),  -- <<< see below also
Now, your code:

    SELECT *
    FROM level_quiz
    LEFT JOIN student_points
      ON level_quiz.id = student_points.level_id
    WHERE student_points.student_no = 40204123
       OR student_points.student_no IS NULL   -- <<-- Makes NO difference

Result (see the fiddle for better formatting):

    id  level_title  quiz_desc          overall_quest        student_no  level_id  points  ts
    1   Sets         The purpose of...  Suppose we have...   40204123    1         80      2021-01-12 15:37:11
    2   Seqs         sequences desc...  overall question...  40204123    2         75      2021-01-12 15:38:06
    3   Prop Logic   logic desc ...     overall quest...     40204123    3         30      2021-01-13 22:13:13
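However, you only have 3 records - there is no `NULL` row for quiz level 4 for `student_no` 40204123!

Now, when I get a strange result, my "go-to" reaction is to check what PostgreSQL does under the same circumstances. I have consistently found PostgreSQL to be superior to MySQL in virtually every respect. So, rather than rushing to report a bug to MySQL (good luck with that... - they have so many!), you should try checking with another server - it is highly unlikely that a fundamental bug in something as basic as a `LEFT JOIN` would have gone undetected for so long! The result is here, and it can be seen that PostgreSQL returns the same data for the same query! What's happening?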
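One way to see what is going on: the `LEFT JOIN` is evaluated before the `WHERE` clause. Level 4 does find matches in student_points (they just belong to other students), so no `NULL`-extended row is ever generated for it, and the `WHERE` clause then throws those matches away. The sketch below reproduces this with Python's built-in `sqlite3` module as a stand-in for the MariaDB fiddle; the in-memory database, the trimmed column list and the variable names are assumptions made for illustration only.

```python
import sqlite3

# In-memory SQLite database standing in for the MariaDB fiddle (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE level_quiz (id INTEGER NOT NULL, level_title TEXT NOT NULL);
    CREATE TABLE student_points (
        student_no INTEGER NOT NULL, level_id INTEGER NOT NULL, points INTEGER);
    INSERT INTO level_quiz VALUES
        (1, 'Sets'), (2, 'Seqs'), (3, 'Prop Logic'), (4, 'Pred Logic');
    INSERT INTO student_points VALUES
        (12345678, 1, 80), (12345678, 2, 25), (12345678, 3, 90), (12345678, 4, 90),
        (40204123, 1, 80), (40204123, 2, 75), (40204123, 3, 30),
        (40213894, 1, 90), (40213894, 2, 95), (40213894, 4, 100);
""")

# The join used in the question, without any WHERE clause yet.
JOIN = """
    SELECT lq.id, sp.student_no
    FROM level_quiz lq
    LEFT JOIN student_points sp ON lq.id = sp.level_id
"""

# Step 1: level 4 DOES match rows, but only rows of other students.
level4_matches = conn.execute(
    JOIN + " WHERE lq.id = 4 ORDER BY sp.student_no").fetchall()
print(level4_matches)   # [(4, 12345678), (4, 40213894)]

# Step 2: the question's WHERE clause removes those matches, and there is
# no NULL-extended row for level 4 left to keep.
filtered = conn.execute(
    JOIN + " WHERE sp.student_no = 40204123 OR sp.student_no IS NULL"
           " ORDER BY lq.id").fetchall()
print(filtered)         # [(1, 40204123), (2, 40204123), (3, 40204123)]
```

This would also explain why the query appeared to break without being touched: it only returns the level 4 row as long as no student at all has a record for level 4.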
Well, let's now take a look at @nbk's answer.

    --
    -- Solution proposed by nbk - NULLs in the result as desired!
    --
    SELECT
      lq.id, lq.level_title, lq.quiz_desc, lq.overall_quest,
      sp.student_no, sp.level_id, sp.points, sp.ts
    FROM level_quiz lq
    LEFT JOIN
    (
      SELECT * FROM student_points
      WHERE student_no = 40204123
    ) sp
      ON lq.id = sp.level_id;

Result (better viewed on the fiddle):

    id  level_title  quiz_desc          overall_quest        student_no  level_id  points  ts
    1   Sets         The purpose of...  Suppose we have...   40204123    1         80      2021-01-12 15:37:11
    2   Seqs         sequences desc...  overall question...  40204123    2         75      2021-01-12 15:38:06
    3   Prop Logic   logic desc ...     overall quest...     40204123    3         30      2021-01-13 22:13:13
    4   Pred Logic   pred desc 1 ...    predicase quest...   NULL        NULL      NULL    NULL
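As an aside, a common alternative to the derived table (not something proposed in either answer) is to move the student filter into the `ON` clause, so that it restricts which right-hand rows may match instead of filtering the joined result. A minimal sketch, again using Python's `sqlite3` as a stand-in for the real server, with a trimmed column list and variable names that are purely illustrative:

```python
import sqlite3

# In-memory SQLite database standing in for the MariaDB fiddle (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE level_quiz (id INTEGER NOT NULL, level_title TEXT NOT NULL);
    CREATE TABLE student_points (
        student_no INTEGER NOT NULL, level_id INTEGER NOT NULL, points INTEGER);
    INSERT INTO level_quiz VALUES
        (1, 'Sets'), (2, 'Seqs'), (3, 'Prop Logic'), (4, 'Pred Logic');
    INSERT INTO student_points VALUES
        (12345678, 1, 80), (12345678, 2, 25), (12345678, 3, 90), (12345678, 4, 90),
        (40204123, 1, 80), (40204123, 2, 75), (40204123, 3, 30),
        (40213894, 1, 90), (40213894, 2, 95), (40213894, 4, 100);
""")

# Derived-table form, as in nbk's answer.
derived = conn.execute("""
    SELECT lq.id, sp.student_no, sp.points
    FROM level_quiz lq
    LEFT JOIN (SELECT * FROM student_points WHERE student_no = 40204123) sp
      ON lq.id = sp.level_id
    ORDER BY lq.id
""").fetchall()

# Same filter expressed as part of the join condition.
on_clause = conn.execute("""
    SELECT lq.id, sp.student_no, sp.points
    FROM level_quiz lq
    LEFT JOIN student_points sp
      ON lq.id = sp.level_id AND sp.student_no = 40204123
    ORDER BY lq.id
""").fetchall()

print(on_clause)
# [(1, 40204123, 80), (2, 40204123, 75), (3, 40204123, 30), (4, None, None)]
print(derived == on_clause)   # True
```

Both forms keep level 4 with `NULL`s (shown as `None` by Python), because the student restriction is applied before the outer join decides whether a left-hand row has a match.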
So now we apparently have the correct result - with `NULL`s for the level 4 quiz! However, let's now see what happens when we run the same queries for **2** students!

    --
    -- bb25's original SQL - with 2 students - but no NULLs in the result!
    --
    SELECT *
    FROM level_quiz
    LEFT JOIN student_points
      ON level_quiz.id = student_points.level_id
    WHERE student_points.student_no IN (40204123, 40213894)
       OR student_points.student_no IS NULL   -- << Makes NO difference!

Result:

    id  level_title  quiz_desc          overall_quest        student_no  level_id  points  ts
    1   Sets         The purpose of...  Suppose we have...   40204123    1         80      2021-01-12 15:37:11
    2   Seqs         sequences desc...  overall question...  40204123    2         75      2021-01-12 15:38:06
    3   Prop Logic   logic desc ...     overall quest...     40204123    3         30      2021-01-13 22:13:13
    1   Sets         The purpose of...  Suppose we have...   40213894    1         90      2021-01-14 21:52:00
    2   Seqs         sequences desc...  overall question...  40213894    2         95      2021-01-17 22:42:50
    4   Pred Logic   pred desc 1 ...    predicase quest...   40213894    4         100     2021-01-17 22:42:50
Unexpectedly, there are no `NULL`s - we have the results of quizzes 1, 2 and 3 for student 40204123 and of quizzes 1, 2 and 4 for student 40213894. Next, we revisit nbk's answer.

    --
    -- SQL proposed by nbk - with 2 students - but again no NULLs in the result!
    --
    SELECT
      lq.id, lq.level_title, lq.quiz_desc, lq.overall_quest,
      sp.student_no, sp.level_id, sp.points, sp.ts
    FROM level_quiz lq
    LEFT JOIN
    (
      SELECT * FROM student_points
      WHERE student_no IN (40204123, 40213894)
    ) sp
      ON lq.id = sp.level_id;

Result:

    id  level_title  quiz_desc          overall_quest        student_no  level_id  points  ts
    1   Sets         The purpose of...  Suppose we have...   40204123    1         80      2021-01-12 15:37:11
    2   Seqs         sequences desc...  overall question...  40204123    2         75      2021-01-12 15:38:06
    3   Prop Logic   logic desc ...     overall quest...     40204123    3         30      2021-01-13 22:13:13
    1   Sets         The purpose of...  Suppose we have...   40213894    1         90      2021-01-14 21:52:00
    2   Seqs         sequences desc...  overall question...  40213894    2         95      2021-01-17 22:42:50
    4   Pred Logic   pred desc 1 ...    predicase quest...   40213894    4         100     2021-01-17 22:42:50
Again, there are no `NULL`s anywhere to be seen! The result of @nbk's answer is the same as the result of the OP's SQL.

Solution 1 - add some (2) records!

So, we do this:

    INSERT INTO student_points VALUES
    (40204123, 4, NULL, NULL),  -- <<< NOW we INSERT these records!
    (40213894, 3, NULL, NULL);
Now we have records for all quiz levels for all students - but obviously a student can't have a score for a level that they haven't completed (`points` = `NULL`), nor can there be a timestamp for something that hasn't happened (`ts` = `NULL`)! So, basically, bb25's (i.e. the OP's) SQL works for this scenario (as does nbk's), and both pieces of SQL work for two students as well as for one - so adding these records solves the problem!

I show only the OP's original SQL (for 2 students) here - more is shown on the fiddle.

    --
    -- bb25 original SQL - 2 students - NULLs NOW in the result
    --
    SELECT *
    FROM level_quiz
    LEFT JOIN student_points
      ON level_quiz.id = student_points.level_id
    WHERE student_points.student_no IN (40204123, 40213894)
    ORDER BY student_points.student_no, level_quiz.id;

Result (better viewed on the fiddle):

    id  level_title  quiz_desc          overall_quest        student_no  level_id  points  ts
    1   Sets         The purpose of...  Suppose we have...   40204123    1         80      2021-01-12 15:37:11
    2   Seqs         sequences desc...  overall question...  40204123    2         75      2021-01-12 15:38:06
    3   Prop Logic   logic desc ...     overall quest...     40204123    3         30      2021-01-13 22:13:13
    4   Pred Logic   pred desc 1 ...    predicase quest...   40204123    4         NULL    NULL
    1   Sets         The purpose of...  Suppose we have...   40213894    1         90      2021-01-14 21:52:00
    2   Seqs         sequences desc...  overall question...  40213894    2         95      2021-01-17 22:42:50
    3   Prop Logic   logic desc ...     overall quest...     40213894    3         NULL    NULL
    4   Pred Logic   pred desc 1 ...    predicase quest...   40213894    4         100     2021-01-17 22:42:50
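The effect of the placeholder record can be checked in isolation. The sketch below uses Python's `sqlite3` as a stand-in for the real server (an assumption, as are the trimmed columns and the variable names) and runs the question's single-student query before and after adding the `NULL` row for level 4.

```python
import sqlite3

# In-memory SQLite database standing in for the MariaDB fiddle (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE level_quiz (id INTEGER NOT NULL, level_title TEXT NOT NULL);
    CREATE TABLE student_points (
        student_no INTEGER NOT NULL, level_id INTEGER NOT NULL, points INTEGER);
    INSERT INTO level_quiz VALUES
        (1, 'Sets'), (2, 'Seqs'), (3, 'Prop Logic'), (4, 'Pred Logic');
    INSERT INTO student_points VALUES
        (12345678, 1, 80), (12345678, 2, 25), (12345678, 3, 90), (12345678, 4, 90),
        (40204123, 1, 80), (40204123, 2, 75), (40204123, 3, 30),
        (40213894, 1, 90), (40213894, 2, 95), (40213894, 4, 100);
""")

# The query from the question, reduced to two columns.
QUERY = """
    SELECT lq.id, sp.points
    FROM level_quiz lq
    LEFT JOIN student_points sp ON lq.id = sp.level_id
    WHERE sp.student_no = 40204123 OR sp.student_no IS NULL
    ORDER BY lq.id
"""

before = conn.execute(QUERY).fetchall()
print(before)   # [(1, 80), (2, 75), (3, 30)]

# Placeholder row: the student "has" level 4, but with no score yet.
conn.execute("INSERT INTO student_points VALUES (40204123, 4, NULL)")

after = conn.execute(QUERY).fetchall()
print(after)    # [(1, 80), (2, 75), (3, 30), (4, None)]
```

Level 4 now appears because the join finds a real row for this student; the `NULL` comes from the stored value, not from the outer join.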
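Now we do have `NULL`s in the appropriate places.

Solution 2 - change the SQL to work with the original dataset:

A better solution might be to actually make the SQL produce the required data **without** having to add supplementary records - especially records with `NULL`s, which many people consider to be problematic. So, here I simply perform a `CROSS JOIN` of the `student_no`s in the `student_points` table with the `id`s of the `level_quiz` table to obtain all possible combinations of students and quizzes...

So, first we `DELETE` the records that were inserted to make Solution 1 work.

    DELETE FROM student_points WHERE points IS NULL;

And then run this SQL:

    SELECT DISTINCT sp1.student_no, t1.id
    FROM student_points sp1
    CROSS JOIN
    (
      SELECT DISTINCT lq.id
      FROM level_quiz lq
    ) AS t1
    ORDER BY sp1.student_no, t1.id;

Result:

    student_no  id
    12345678    1
    12345678    2
    12345678    3
    12345678    4
    40204123    1
    40204123    2
    40204123    3
    40204123    4
    40213894    1
    40213894    2
    40213894    3
    40213894    4
    12 rows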
Then, we have to `JOIN` these records back to their original tables:

    SELECT
      t2.id,
      SUBSTRING(lq2.level_title, 1, 6) AS "LT:",
      lq2.quiz_desc,
      lq2.overall_quest,
      t2.student_no,
      COALESCE(sp2.points, 0) AS "Points:",
      sp2.ts
    FROM
    (
      SELECT DISTINCT sp1.student_no, t1.id
      FROM student_points sp1
      CROSS JOIN
      (
        SELECT DISTINCT lq1.id
        FROM level_quiz lq1
      ) AS t1
    ) AS t2
    LEFT JOIN student_points sp2
      ON t2.student_no = sp2.student_no
      AND t2.id = sp2.level_id
    JOIN level_quiz lq2
      ON t2.id = lq2.id
    WHERE t2.student_no IN (40204123, 40213894)
    ORDER BY t2.student_no, t2.id;

Result:

    id  LT:     quiz_desc          overall_quest        student_no  Points:  ts
    1   Sets    The purpose of...  Suppose we have...   40204123    80       2021-01-12 15:37:11
    2   Seqs    sequences desc...  overall question...  40204123    75       2021-01-12 15:38:06
    3   Prop L  logic desc ...     overall quest...     40204123    30       2021-01-13 22:13:13
    4   Pred L  pred desc 1 ...    predicase quest...   40204123    0        NULL
    1   Sets    The purpose of...  Suppose we have...   40213894    90       2021-01-14 21:52:00
    2   Seqs    sequences desc...  overall question...  40213894    95       2021-01-17 22:42:50
    3   Prop L  logic desc ...     overall quest...     40213894    0        NULL
    4   Pred L  pred desc 1 ...    predicase quest...   40213894    100      2021-01-17 22:42:50
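The two steps of this approach (build every student/level pair, then `LEFT JOIN` the real scores onto the pairs) can be tried out with a small self-contained sketch. It uses Python's `sqlite3` as a stand-in for MySQL/MariaDB, drops the descriptive columns and adds explicit column aliases in the subquery; all of those choices are assumptions made to keep the example short.

```python
import sqlite3

# In-memory SQLite database standing in for the MariaDB fiddle (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE level_quiz (id INTEGER NOT NULL, level_title TEXT NOT NULL);
    CREATE TABLE student_points (
        student_no INTEGER NOT NULL, level_id INTEGER NOT NULL, points INTEGER);
    INSERT INTO level_quiz VALUES
        (1, 'Sets'), (2, 'Seqs'), (3, 'Prop Logic'), (4, 'Pred Logic');
    INSERT INTO student_points VALUES
        (12345678, 1, 80), (12345678, 2, 25), (12345678, 3, 90), (12345678, 4, 90),
        (40204123, 1, 80), (40204123, 2, 75), (40204123, 3, 30),
        (40213894, 1, 90), (40213894, 2, 95), (40213894, 4, 100);
""")

# Step 1: every (student, level) combination, whether or not a score exists.
pairs = conn.execute("""
    SELECT DISTINCT sp.student_no, lq.id
    FROM student_points sp
    CROSS JOIN level_quiz lq
    ORDER BY sp.student_no, lq.id
""").fetchall()
print(len(pairs))   # 12 (3 students x 4 levels)

# Step 2: attach the real scores; pairs with no score fall back to 0.
rows = conn.execute("""
    SELECT t.student_no, t.id, COALESCE(sp2.points, 0) AS points
    FROM (
        SELECT DISTINCT sp1.student_no AS student_no, lq1.id AS id
        FROM student_points sp1
        CROSS JOIN level_quiz lq1
    ) AS t
    LEFT JOIN student_points sp2
      ON t.student_no = sp2.student_no AND t.id = sp2.level_id
    WHERE t.student_no IN (40204123, 40213894)
    ORDER BY t.student_no, t.id
""").fetchall()
for row in rows:
    print(row)
```

Here the `WHERE` clause is safe because it filters on the pair list (`t`), which always has a row for every student/level combination, rather than on the optional side of the outer join.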
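We can see that, instead of `NULL`s, we have 0 points as a result of the `COALESCE` function, but our missing records have now "reappeared".

Solution 3 - redesign the schema:

What if, say, we had a student who hadn't taken any quizzes (casting my mind back to my own student days, this is highly possible!) - how would we handle that scenario?

We can improve the schema (fiddle here).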
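Relations (tables) are entities (things) - my (particularly brilliant) synopsis of relational theory! :-). Now, a `quiz` is a "thing", which means that it should correspond to a relation (i.e. a table) in our relational database. Students are also "things" - hence `student` requires a table. The "tricky" bit is that the relationship between the `student` and the `quiz` is also a "thing", and should therefore be a table as well! Entities such as this are called `Associative Entities` and their corresponding tables are called associative tables - but more commonly `joining` or `linking` tables (in fact, there are 17 names at the link).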
New schema:

So, my own recommendation would be that you do the following:

    CREATE TABLE student
    (
      s_id   INTEGER      NOT NULL,
      s_name VARCHAR (20) NOT NULL,
      CONSTRAINT student_pk PRIMARY KEY (s_id)
    );

    CREATE TABLE quiz
    (
      q_id    INTEGER      NOT NULL,
      q_title VARCHAR (50) NOT NULL,
      CONSTRAINT ql_pk PRIMARY KEY (q_id)
    );

    CREATE TABLE student_score
    (
      ss_s_id INTEGER   NOT NULL,
      ss_q_id INTEGER   NOT NULL,
      score   INTEGER   NOT NULL,
      ts      TIMESTAMP NOT NULL,
      CONSTRAINT sp_pk PRIMARY KEY (ss_s_id, ss_q_id),
      CONSTRAINT sp_s_no_fk FOREIGN KEY (ss_s_id) REFERENCES student (s_id),
      CONSTRAINT sp_ql_id FOREIGN KEY (ss_q_id) REFERENCES quiz (q_id)
    );
This answer is becoming rather long, so I will just give the final SQL here (some intermediate steps are shown in the fiddle):

    SELECT
      q.q_id, q.q_title,
      s.s_id, s.s_name,
      COALESCE(ss.score, 0) AS score
    FROM quiz q
    CROSS JOIN student s
    LEFT JOIN student_score ss
      ON ss.ss_s_id = s.s_id
      AND ss.ss_q_id = q.q_id
    ORDER BY s.s_id, q.q_id;

Result (note the 4 `0`s for student 4!):

    q_id  q_title  s_id      s_name         score
    1     Quiz 1   12345678  Student1_name  80
    2     Quiz 2   12345678  Student1_name  25
    3     Quiz 3   12345678  Student1_name  90
    4     Quiz 4   12345678  Student1_name  90
    1     Quiz 1   40204123  Student2_name  80
    2     Quiz 2   40204123  Student2_name  75
    3     Quiz 3   40204123  Student2_name  30
    4     Quiz 4   40204123  Student2_name  0
    1     Quiz 1   40213894  Student3_name  90
    2     Quiz 2   40213894  Student3_name  95
    3     Quiz 3   40213894  Student3_name  0
    4     Quiz 4   40213894  Student3_name  100
    1     Quiz 1   98765432  Student4_name  0
    2     Quiz 2   98765432  Student4_name  0
    3     Quiz 3   98765432  Student4_name  0
    4     Quiz 4   98765432  Student4_name  0
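For anyone who wants to try the normalised design without a fiddle, here is a rough sketch using Python's `sqlite3` module. SQLite is only a stand-in for MySQL or PostgreSQL, the `ts` column is left out for brevity, and the sample rows are reconstructed from the result shown above, so treat all of that as assumptions rather than as the exact fiddle contents.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (
        s_id   INTEGER NOT NULL PRIMARY KEY,
        s_name TEXT    NOT NULL);
    CREATE TABLE quiz (
        q_id    INTEGER NOT NULL PRIMARY KEY,
        q_title TEXT    NOT NULL);
    -- Associative (linking) table: one row per quiz actually taken.
    CREATE TABLE student_score (
        ss_s_id INTEGER NOT NULL REFERENCES student (s_id),
        ss_q_id INTEGER NOT NULL REFERENCES quiz (q_id),
        score   INTEGER NOT NULL,
        PRIMARY KEY (ss_s_id, ss_q_id));

    INSERT INTO student VALUES
        (12345678, 'Student1_name'), (40204123, 'Student2_name'),
        (40213894, 'Student3_name'), (98765432, 'Student4_name');
    INSERT INTO quiz VALUES
        (1, 'Quiz 1'), (2, 'Quiz 2'), (3, 'Quiz 3'), (4, 'Quiz 4');
    INSERT INTO student_score VALUES
        (12345678, 1, 80), (12345678, 2, 25), (12345678, 3, 90), (12345678, 4, 90),
        (40204123, 1, 80), (40204123, 2, 75), (40204123, 3, 30),
        (40213894, 1, 90), (40213894, 2, 95), (40213894, 4, 100);
""")

# Every student paired with every quiz; missing scores are reported as 0.
rows = conn.execute("""
    SELECT s.s_id, q.q_id, COALESCE(ss.score, 0) AS score
    FROM quiz q
    CROSS JOIN student s
    LEFT JOIN student_score ss
      ON ss.ss_s_id = s.s_id AND ss.ss_q_id = q.q_id
    ORDER BY s.s_id, q.q_id
""").fetchall()

print(len(rows))   # 16 (4 students x 4 quizzes)

# The student with no rows at all in student_score still shows up.
no_quizzes = [(q_id, score) for s_id, q_id, score in rows if s_id == 98765432]
print(no_quizzes)  # [(1, 0), (2, 0), (3, 0), (4, 0)]
```

Because students now live in their own table, the list of students no longer has to be derived from the scores table, which is why a student with no scores can appear at all.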
Performance analysis:

Using MySQL 8's `EXPLAIN ANALYZE` functionality, we see that the working SQL with the old schema produces the following plan (see the fiddle here):

    EXPLAIN
    -> Sort: t2.student_no, t2.id (actual time=0.184..0.185 rows=8 loops=1)
        -> Stream results (cost=32.42 rows=320) (actual time=0.133..0.170 rows=8 loops=1)
            -> Left hash join (sp2.level_id = lq2.id), (sp2.student_no = t2.student_no) (cost=32.42 rows=320) (actual time=0.125..0.148 rows=8 loops=1)
                -> Nested loop inner join (cost=5.85 rows=32) (actual time=0.082..0.100 rows=8 loops=1)
                    -> Table scan on lq2 (cost=0.65 rows=4) (actual time=0.005..0.015 rows=4 loops=1)
                    -> Index lookup on t2 using <auto_key2> (id=lq2.id) (actual time=0.001..0.002 rows=2 loops=4)
                        -> Materialize (cost=4.60 rows=8) (actual time=0.020..0.021 rows=2 loops=4)
                            -> Table scan on <temporary> (actual time=0.000..0.001 rows=8 loops=1)
                                -> Temporary table with deduplication (cost=4.60 rows=8) (actual time=0.062..0.063 rows=8 loops=1)
                                    -> Inner hash join (no condition) (cost=4.60 rows=8) (actual time=0.046..0.049 rows=24 loops=1)
                                        -> Table scan on t1 (cost=1.48 rows=4) (actual time=0.000..0.001 rows=4 loops=1)
                                            -> Materialize (cost=0.65 rows=4) (actual time=0.021..0.022 rows=4 loops=1)
                                                -> Table scan on <temporary> (actual time=0.000..0.001 rows=4 loops=1)
                                                    -> Temporary table with deduplication (cost=0.65 rows=4) (actual time=0.016..0.017 rows=4 loops=1)
                                                        -> Table scan on lq1 (cost=0.65 rows=4) (actual time=0.004..0.008 rows=4 loops=1)
                                        -> Hash
                                            -> Filter: (sp1.student_no in (40204123,40213894)) (cost=1.25 rows=2) (actual time=0.010..0.016 rows=6 loops=1)
                                                -> Table scan on sp1 (cost=1.25 rows=10) (actual time=0.005..0.014 rows=10 loops=1)
                -> Hash
                    -> Table scan on sp2 (cost=0.16 rows=10) (actual time=0.015..0.026 rows=10 loops=1)
The same SQL using PostgreSQL's `EXPLAIN (ANALYZE, BUFFERS, COSTS, TIMING)` functionality shows a very complex plan (see the bottom of this fiddle).

The plan for the SQL with the modified schema is as follows (see the bottom of the fiddle):

    EXPLAIN
    -> Nested loop left join (cost=2.05 rows=4) (actual time=0.018..0.031 rows=4 loops=1)
        -> Table scan on q (cost=0.65 rows=4) (actual time=0.011..0.015 rows=4 loops=1)
        -> Single-row index lookup on ss using PRIMARY (ss_s_id=40204123, ss_q_id=q.q_id) (cost=0.28 rows=1) (actual time=0.003..0.003 rows=1 loops=4)
And checking with PostgreSQL here:

    QUERY PLAN
    Hash Left Join (cost=14.64..46.31 rows=540 width=188) (actual time=0.077..0.083 rows=4 loops=1)
      Hash Cond: ((s.s_id = ss.ss_s_id) AND (q.q_id = ss.ss_q_id))
      Buffers: shared hit=5
      -> Nested Loop (cost=0.15..28.97 rows=540 width=184) (actual time=0.032..0.035 rows=4 loops=1)
            Buffers: shared hit=3
            -> Index Scan using student_pk on student s (cost=0.15..8.17 rows=1 width=62) (actual time=0.020..0.021 rows=1 loops=1)
                  Index Cond: (s_id = 40204123)
                  Buffers: shared hit=2
            -> Seq Scan on quiz q (cost=0.00..15.40 rows=540 width=122) (actual time=0.008..0.009 rows=4 loops=1)
                  Buffers: shared hit=1
      -> Hash (cost=14.37..14.37 rows=8 width=12) (actual time=0.027..0.027 rows=3 loops=1)
            Buckets: 1024 Batches: 1 Memory Usage: 9kB
            Buffers: shared hit=2
            -> Bitmap Heap Scan on student_score ss (cost=4.21..14.37 rows=8 width=12) (actual time=0.017..0.019 rows=3 loops=1)
                  Recheck Cond: (ss_s_id = 40204123)
                  Heap Blocks: exact=1
                  Buffers: shared hit=2
                  -> Bitmap Index Scan on sp_pk (cost=0.00..4.21 rows=8 width=0) (actual time=0.006..0.007 rows=3 loops=1)
                        Index Cond: (ss_s_id = 40204123)
                        Buffers: shared hit=1
    Planning Time: 0.254 ms
    Execution Time: 0.188 ms
    22 rows
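So, the moral of the story appears to be that a well-normalised schema will produce the correct result, or at least make the correct result easier to obtain and more performant! Good to know!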