Can CTE simplify repeated, and possibly recursive, joins?
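I have three tables as shown in the figure below, which shows foreign keys and unique fields (in green):
They represent the email messages received by a server. (Sorry, I could not simplify it any further. The full version is here.) Each message is authenticated by SPF and DKIM, which links it to the corresponding domains. An SPF authentication can link to two domains: the so-called helo domain, which is the name of the sending server, and the bounce address domain, which is usually that of the author's email address. DKIM, by contrast, allows any domain that handles the message to add one or more signatures, thereby taking some responsibility for the message itself. A message typically carries one or two DKIM signatures, but there can be more.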
The many-to-many msg_ref table is created as follows (see db<>fiddle for the table creation and sample data):

    CREATE TABLE msg_ref (
      id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
      message_in INT UNSIGNED NOT NULL COMMENT 'Foreign key to message_in',
      domain INT UNSIGNED NOT NULL COMMENT 'Foreign key to domain',
      auth SET ('author', 'spf_helo', 'spf', 'dkim', 'org', 'dmarc') NOT NULL,
      spf ENUM ('none', 'neutral', 'pass', 'fail', 'softfail', 'temperror', 'permerror') NOT NULL,
      dkim ENUM ('none', 'pass', 'fail', 'policy', 'neutral', 'temperror', 'permerror') NOT NULL,
      dkim_order TINYINT UNSIGNED NOT NULL DEFAULT 0,
      INDEX by_dom_msg(domain, message_in),
      INDEX by_msg_auth(message_in, auth)
    );
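To provide feedback about authentication, DMARC introduced aggregate reports, which are sent at the end of the day to the domains that request them. A report conveys to the sender the results of the authentication checks performed by the receiver. It consists of multiple rows, each containing the source IP of the last relay (from message_in, converted), the number of messages with the same authentication results, and the results proper (from msg_ref).
Here is a query that extracts the SPF result and up to four DKIM results: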

    SELECT INET_NTOA(CONV(HEX(m.ip),16,10)) AS source, COUNT(DISTINCT(m.id)) AS count,
      m.dmarc_dispo AS disposition,
      da.domain AS author,
      dspf.domain AS spf, rspf.spf AS spf_result,
      d1.domain AS dkim1, r1.dkim AS dkim1_result,
      d2.domain AS dkim2, r2.dkim AS dkim2_result,
      d3.domain AS dkim3, r3.dkim AS dkim3_result,
      d4.domain AS dkim4, r4.dkim AS dkim4_result
    FROM message_in AS m
    LEFT JOIN (msg_ref AS rd INNER JOIN domain AS dd ON rd.domain = dd.id)
      ON m.id = rd.message_in AND FIND_IN_SET('dmarc', rd.auth)
    LEFT JOIN (msg_ref AS ra INNER JOIN domain AS da ON ra.domain = da.id)
      ON m.id = ra.message_in AND FIND_IN_SET('author', ra.auth)
    LEFT JOIN (msg_ref AS rspf INNER JOIN domain AS dspf ON rspf.domain = dspf.id)
      ON m.id = rspf.message_in AND FIND_IN_SET('spf', rspf.auth)
    LEFT JOIN (msg_ref AS rhelo INNER JOIN domain AS dhelo ON rhelo.domain = dhelo.id)
      ON m.id = rhelo.message_in AND FIND_IN_SET('spf_helo', rhelo.auth)
    LEFT JOIN (msg_ref AS r1 INNER JOIN domain AS d1 ON r1.domain = d1.id)
      ON m.id = r1.message_in AND r1.dkim_order = 1
    LEFT JOIN (msg_ref AS r2 INNER JOIN domain AS d2 ON r2.domain = d2.id)
      ON m.id = r2.message_in AND r2.dkim_order = 2
    LEFT JOIN (msg_ref AS r3 INNER JOIN domain AS d3 ON r3.domain = d3.id)
      ON m.id = r3.message_in AND r3.dkim_order = 3
    LEFT JOIN (msg_ref AS r4 INNER JOIN domain AS d4 ON r4.domain = d4.id)
      ON m.id = r4.message_in AND r4.dkim_order = 4
    GROUP BY source, disposition, author,
      spf, spf_result,
      dkim1, dkim1_result,
      dkim2, dkim2_result,
      dkim3, dkim3_result,
      dkim4, dkim4_result
Not shown, there is a WHERE clause that restricts the output to a specific rd.domain and to rows whose m.mtime lies within a given reporting period. The result is the content of a DMARC report, as described on Wikipedia:

    +-----------+-------+-------------+-------------+-------------+------------+-------------+--------------+-------------+--------------+-------------+--------------+-------------+--------------+
    | source    | count | disposition | Header from | spf         | spf_result | dkim1       | dkim1_result | dkim2       | dkim2_result | dkim3       | dkim3_result | dkim4       | dkim4_result |
    +-----------+-------+-------------+-------------+-------------+------------+-------------+--------------+-------------+--------------+-------------+--------------+-------------+--------------+
    | 192.0.2.1 |    12 | none        | example.com | example.com | pass       | example.com | pass         | example.com | pass         | example.net | pass         | example.net | pass         |
    | 192.0.2.1 |     1 | none        | example.com | example.com | pass       | example.com | pass         | example.net | pass         | example.net | pass         | NULL        | NULL         |
    +-----------+-------+-------------+-------------+-------------+------------+-------------+--------------+-------------+--------------+-------------+--------------+-------------+--------------+
    2 rows in set (0.004 sec)
Edited questions:
(Removed the non-working attempts and possibly ambiguous wording. Reordered to put the most important first.)
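- How can the query be expressed with a recursive CTE so that result rows contain a variable number of dkim<n> and dkim<n>_result columns, covering all the DKIM authentications that appear in the relevant messages?
- How can the query be significantly simplified by factoring out the repeated expressions?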
Note: Queries are passed through OpenDBX, which can use several DBMSs. MariaDB v.15 is a reasonable target.
As with most (all?) SQL queries, the number of columns in the result is fixed. A recursive CTE can only add rows, not columns.
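As a minimal illustration of that point (the CTE name seq and its column n are made up for this example and have nothing to do with the tables in the question), each recursion step below appends one row, while the column list stays whatever the anchor SELECT defined:

```sql
-- Hypothetical example: seq and n are illustrative names only.
WITH RECURSIVE seq (n) AS (
    SELECT 1                 -- anchor member: fixes the column list (one column)
    UNION ALL
    SELECT n + 1             -- recursive member: contributes one more row per step
    FROM seq
    WHERE n < 4              -- stop condition
)
SELECT n FROM seq;
```

This returns four rows (1 to 4) in a single column. No amount of recursion turns them into four columns.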
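You are lucky that the result, used to generate the ARF report, is XML. As such, you can use the old GROUP_CONCAT to generate the XML form of the DKIM results.
I haven't looked at the XML spec for the report, but something of the form: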

    SELECT ...,
      GROUP_CONCAT(
        CONCAT('<dkim_results><dkim_domain>', d1.domain,
               '</dkim_domain><dkim_result>', r1.dkim,
               '</dkim_result></dkim_results>')
        ORDER BY r1.dkim_order SEPARATOR '') dkim_result
    FROM message_in AS m ....
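To make that a bit more concrete, here is a rough sketch of how the whole query might look with the repeated msg_ref/domain join factored into a (non-recursive) CTE. Table and column names are taken from the question. The CTE names ref and per_msg, the column names domain_name and dkim_xml, and the XML element names are my own placeholders, and I am assuming that DKIM rows are exactly those with dkim_order > 0. The WHERE clause (and the 'dmarc' join it relies on) is left out, as in the question. Untested against the real schema:

```sql
WITH ref AS (
    -- The msg_ref/domain join that the original query spells out eight times.
    SELECT r.message_in, r.auth, r.spf, r.dkim, r.dkim_order,
           d.domain AS domain_name          -- placeholder column name
    FROM msg_ref AS r
    INNER JOIN domain AS d ON r.domain = d.id
),
per_msg AS (
    -- One row per message, with all its DKIM results folded into one string.
    SELECT m.id,
           INET_NTOA(CONV(HEX(m.ip), 16, 10)) AS source,
           m.dmarc_dispo    AS disposition,
           ra.domain_name   AS author,
           rspf.domain_name AS spf,
           rspf.spf         AS spf_result,
           GROUP_CONCAT(
               CONCAT('<dkim_results><dkim_domain>', rk.domain_name,
                      '</dkim_domain><dkim_result>', rk.dkim,
                      '</dkim_result></dkim_results>')
               ORDER BY rk.dkim_order SEPARATOR '') AS dkim_xml
    FROM message_in AS m
    LEFT JOIN ref AS ra
           ON m.id = ra.message_in AND FIND_IN_SET('author', ra.auth)
    LEFT JOIN ref AS rspf
           ON m.id = rspf.message_in AND FIND_IN_SET('spf', rspf.auth)
    LEFT JOIN ref AS rk
           ON m.id = rk.message_in AND rk.dkim_order > 0  -- assumption: DKIM rows only
    GROUP BY m.id, m.ip, m.dmarc_dispo,
             ra.domain_name, rspf.domain_name, rspf.spf
)
-- Count the messages that share the same source and authentication results.
SELECT source, COUNT(DISTINCT id) AS count, disposition,
       author, spf, spf_result, dkim_xml
FROM per_msg
GROUP BY source, disposition, author, spf, spf_result, dkim_xml;
```

The number of columns is now fixed regardless of how many signatures a message carries, since they all end up in dkim_xml (NULL when there are none). Keep in mind that GROUP_CONCAT truncates its result at group_concat_max_len, and that CTEs need MariaDB 10.2 or later, so this may not port to every DBMS behind OpenDBX.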
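Other recommendations:
- Use InnoDB as the table type rather than MyISAM/Aria.
- If MariaDB 10.5 is the only target, use the INET6 data type introduced in 10.5.
- The DKIM enumeration probably matches the definitions in the spec, and the IETF's DKIM working group is a "Concluded WG", so no further changes are expected.
- For the non-DKIM results, possibly use separate tables with message_id and auth domain columns for each. Then just join those as required.
ref: fiddle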