隨機值重複,無法選擇N個隨機配對,子查詢不刷新 - sqlite
問題
使用 SQLite v3.35.4 和 v3.36.0 我有一個
first_name
表和一個surname
包含常用名稱列表的表。我想在一個新表中生成 N 個配對。我寫了這個遞歸查詢:
WITH RECURSIVE cte(first_name, surname) AS ( SELECT first_name, surname from ( -- always returns the same value select first_name, surname from (select first_name from first_name order by random() limit 1) join (select surname from surname order by random() limit 1) ) UNION ALL SELECT first_name, surname FROM cte LIMIT 2000 ) SELECT first_name, surname FROM cte;
不幸的是,輸出看起來像這樣:
+------------+---------+ | first_name | surname | +------------+---------+ | james | smith | | james | smith | | james | smith | | ---------- | ------- | | ... | ... | +------------+---------+
我試過的
在查看 SQLite 文件後,我嘗試了遞歸 CTE 和子查詢展平部分
NOT MATERIALIZED
概述的幾個條件。我將隨機名稱選擇放在一個視圖中。然而,這些都沒有對結果產生積極影響。有沒有辦法執行我正在嘗試做的事情?
*編輯
我嘗試了一個視窗函式並從 where 子句中隨機選擇名稱,但沒有成功:(其中 1998 是表的大小)
with recursive r_first_name as ( select first_name, ROW_NUMBER() over(order by random()) as rn from first_name ), r_surname as ( select surname, ROW_NUMBER() over(order by random()) as rn from surname ), rcte(first_name, surname) as ( select first_name, surname from r_first_name rf join r_surname rs on rs.rn = (select abs(random() % 1998)) where rf.rn = (select abs(random() % 1998)) union all select first_name, surname from rcte limit 3000 ) select * from rcte
!!!解決方案 !!!
在查看了類似問題的答案後。
我發現在 CTE 的遞歸方面,a
random()
將成功更新。雖然不幸的是,它在嵌套在子查詢中時不會更新,但如果它位於 CTE 遞歸的“根”,我可以利用它來獲取隨機數。以下是我開發的解決方案。它符合我的特定案例,並且與交叉連接相比性能相對較高:
WITH RECURSIVE cte AS ( select abs(random()) % (select count(*) from first_name) as first_name_num, abs(random()) % (select count(*) from surname) as surname_num union all select abs(random()) % (select count(*) from first_name) as first_name_num, abs(random()) % (select count(*) from surname) as surname_num from cte LIMIT 6000 ), result as ( select * from cte join (select first_name, ROW_NUMBER() over (order by random()) as rn from first_name) fn -- this is always the same result on cte.first_name_num = fn.rn join (select surname, ROW_NUMBER() over (order by random()) as rn from surname) sn -- this updates every loop around except subqueries are compiled/cached or something so they are unusable here if you want updated values on cte.surname_num = sn.rn ) select first_name, surname from result
“我想在一個新表中生成 N 個配對。 ”為什麼不簡單地做一個這樣的
CROSS JOIN
withLIMIT
子句呢?SELECT fn.first_name, sn.surname FROM first_name fn CROSS JOIN surname sn LIMIT 2000;
這應該會產生一個不同的 2,000 行列表
first_name
和surname
組合(假設您在每個表中都沒有重複)。這只是偽隨機的,因為結果是不確定的,因為沒有ORDER BY
子句。但它比查詢要簡單得多,尤其是當您只需要載入一次表時。您還可以添加一個帶有函式的
ORDER BY
子句來提高上述查詢的隨機性,如下所示:RANDOM()
SELECT fn.first_name, sn.surname FROM first_name fn CROSS JOIN surname sn ORDER BY RANDOM() LIMIT 2000;
我的直覺告訴我,這也不是完全隨機的,但應該比我的第一個查詢好得多,如果你確實需要一些隨機性的話。
注意我還建議不要將列命名為與它們所屬的表完全相同的名稱。這可能會在語法上造成混淆,並使可讀性更加困難。就我個人而言,我有一個
names
帶有欄位的表來表示它的名稱類型(名字與姓氏等)。但是,如果您想要兩個顯式表,那麼我只需將列命名為通用名稱,name
因為當您選擇first_name.name
.