Postgresql
Order by exact match (jsonb array), then by lexeme similarity
Suppose I have a table like this in Postgres 9.5:
CREATE TABLE public.posts ( content text, tags jsonb );
I want to devise a query that:
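- Finds results based on exact tag OR matches (tags ?| array['tag1','tag2']) AND matching free-form text (to_tsvector(content) @@ plainto_tsquery('some phrase')).
- Ranks primarily on tags matches, based on the number of matches - i.e., if row A has [apple, orange] and row B has just [apple], then a search for array['orange', 'apple'] will yield row A above row B (but both will be returned).
- Ranks secondarily on the weight/similarity of the content match. So a search for the keywords 'french hello' would yield a row with the content "how do I say 'hello' in French?" above "what's the weather like in the French Riviera?"
How would I combine the above in a single query, so that exact matches on tags and/or fuzzy matches on content both yield results ranked with the above weights?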
Check whether this is what you are looking for.
(I didn't fully understand your first OR / AND condition; I assumed it was just an OR.)
    WITH posts AS (
        SELECT *
        FROM (VALUES
            ('how do I say ''hello'' in French?', '{"orange":1, "apple":2}'::jsonb),
            ('what''s the weather like in the French Riviera?', '{"peach":3, "lemon":4}'::jsonb),
            ('awful weather in England', '{"peach":5, "lemon":6}'::jsonb),
            ('awful weather in England', '{"pineapple":5, "strawberry":6}'::jsonb),
            ('doubtful french fries', '{"blueberry":5, "pear":6}'::jsonb),
            ('the rain, in Spain, is mainly in the plain', '{"melon":7, "watermelon":8, "banana":9}'::jsonb)
        ) AS posts(content, tags)
    )
    SELECT
        *,
        /* Use ts_rank to compare level of full text search coincidence */
        ts_rank(to_tsvector(content),
                plainto_tsquery('french') || plainto_tsquery('hello')) AS rank,
        /* Subquery to count number of tag matches */
        (SELECT count(case when tags ? a then 1 end)
           FROM unnest(array['melon', 'banana', 'lemon']) AS a
        ) AS number_of_matching_tags
    FROM posts
    WHERE
        /* Check for any of the tags */
        tags ?| array['melon', 'banana', 'lemon']
        OR
        /* Check for any of the search terms. You have to || tsqueries */
        to_tsvector(content) @@ (plainto_tsquery('french') || plainto_tsquery('hello'))
    ORDER BY
        number_of_matching_tags desc nulls last,
        rank desc;
(The number_of_matching_tags and rank columns are included only to clarify the results.)
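The sample data above stores tags as jsonb objects, while the question title mentions a jsonb array. The ? and ?| operators also match string elements of a top-level jsonb array, so the same approach should carry over. The following is a minimal sketch under that assumption; the three sample rows and the search terms are made up for illustration and are not from the original post.

```sql
-- Hypothetical sample data: tags stored as jsonb arrays of strings.
WITH posts(content, tags) AS (
    VALUES
        ('how do I say ''hello'' in French?', '["apple", "orange"]'::jsonb),
        ('apple pie recipe',                   '["apple"]'::jsonb),
        ('the rain in Spain',                  '["melon"]'::jsonb)
)
SELECT
    content,
    tags,
    -- Count how many of the searched tags appear in this row's array.
    (SELECT count(*)
       FROM unnest(array['orange', 'apple']) AS t(tag)
      WHERE posts.tags ? t.tag) AS number_of_matching_tags,
    -- Full-text rank against either search term.
    ts_rank(to_tsvector(content),
            plainto_tsquery('french') || plainto_tsquery('hello')) AS rank
FROM posts
WHERE tags ?| array['orange', 'apple']
   OR to_tsvector(content) @@ (plainto_tsquery('french') || plainto_tsquery('hello'))
ORDER BY number_of_matching_tags DESC, rank DESC;
```

With this data, the row tagged with both apple and orange sorts above the row tagged only with apple, and the melon row is filtered out because it matches neither a tag nor a search term.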