Sql-Server

SQL MAX value across multiple columns in a computed column

  • July 31, 2020

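Solution 1 in the article below discusses finding the maximum value across many columns. I would like to do this in a computed/persisted column. How can I do that?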

https://www.mssqltips.com/sqlservertip/4067/find-max-value-from-multiple-columns-in-a-sql-server-table/

create table dbo.TestAmount
(
   Amount1 int,
   Amount2 int,
   Amount3 int,
   MaxValuedata as (select MAX(MaxAmount) FROM (VALUES (Amount1),(Amount2),(Amount3)) AS MaxAmount(LastAmount)) 
)

There may be 10 values in the future, so I'm trying to avoid a long CASE statement.

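First things first... I'm not advocating that you do this... I'm just showing you how it can be done. If your table sees a high volume of inserts and/or updates, you may notice a significant performance hit. Using a scalar UDF in a computed column will force all queries against the table to execute serially.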

Start by creating a scalar function similar to the following...

CREATE FUNCTION dbo.GreatestOfThreeInts
/* ================================================================================================
Scalar function created for the sole purpose of calculating the MaxVal computed column on dbo.Test.
================================================================================================ */
(
   @C1 INT,
   @C2 INT,
   @C3 INT
)
RETURNS INT WITH SCHEMABINDING --, RETURNS NULL ON NULL INPUT --<< use this if NULLs are a possibility..
AS 
BEGIN 
   DECLARE @MaxVal INT = ( SELECT MAX(x.Val) FROM ( VALUES (@C1), (@C2), (@C3) ) x (Val) );
   RETURN ISNULL(@MaxVal, 0);
END;
GO

Create the table with a PERSISTED computed column or, if the table already exists, use the ALTER TABLE ... ADD syntax to add the PERSISTED computed column...

CREATE TABLE dbo.Test (
   C1 INT NOT NULL,
   C2 INT NOT NULL,
   c3 INT NOT NULL,
   MaxVal AS dbo.GreatestOfThreeInts(C1, C2, C3) PERSISTED NOT NULL    -- persist the value so that it doesn't need to be constantly recomputed
   );
GO

CREATE NONCLUSTERED INDEX ix_Test_MaxVal ON dbo.Test (MaxVal) INCLUDE (C1, C2, c3);
GO 
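
The ALTER / ADD route mentioned above is not shown in the answer, so here is a minimal sketch of it. The table name dbo.TestExisting is hypothetical and stands in for a table that already exists without the computed column; the function is the dbo.GreatestOfThreeInts defined earlier.

```sql
-- Hypothetical pre-existing table (the name is illustrative only).
CREATE TABLE dbo.TestExisting (
   C1 INT NOT NULL,
   C2 INT NOT NULL,
   c3 INT NOT NULL
   );
GO

-- Add the persisted computed column after the fact.
-- SQL Server has to compute and store the value for every row already in the table.
ALTER TABLE dbo.TestExisting
   ADD MaxVal AS dbo.GreatestOfThreeInts(C1, C2, c3) PERSISTED NOT NULL;
GO
```

On a large table, adding a persisted column this way touches every row, so it is probably best scheduled for a quiet period.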

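Why do I keep saying PERSISTED?... Buy once, cry once... Unless you have a very write-heavy usage pattern, you're better off computing the value on insert and update rather than every time the column is referenced in a SELECT... especially if the column will be used in predicates or sort operations.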
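
If you want to confirm that the column really was stored as persisted and qualifies for indexing, a metadata query along these lines should do it. This is only a sketch against the dbo.Test table from this answer; it uses the sys.computed_columns catalog view and the COLUMNPROPERTY function.

```sql
-- Inspect the computed column(s) on dbo.Test.
SELECT
   cc.name,
   cc.definition,
   cc.is_persisted,                                                             -- 1 = value is physically stored
   COLUMNPROPERTY(cc.object_id, cc.name, 'IsDeterministic') AS IsDeterministic, -- required for PERSISTED
   COLUMNPROPERTY(cc.object_id, cc.name, 'IsPrecise')       AS IsPrecise,
   COLUMNPROPERTY(cc.object_id, cc.name, 'IsIndexable')     AS IsIndexable
FROM
   sys.computed_columns cc
WHERE
   cc.object_id = OBJECT_ID(N'dbo.Test');
GO
```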

Great... let's see it in action...

INSERT dbo.Test (C1, C2, c3) VALUES
   (123,456,789),
   (345,478,123),
   (523,321,852),
   (111,471,951),
   (874,320,357),
   (965,102,478);
GO 

SELECT * FROM dbo.Test t ORDER BY t.MaxVal OPTION(QUERYTRACEON 176);
GO 

SELECT * FROM dbo.Test t WHERE t.MaxVal >= 800 AND t.MaxVal < 900 OPTION(QUERYTRACEON 176);
GO 

The results...

C1          C2          c3          MaxVal
----------- ----------- ----------- -----------
345         478         123         478
123         456         789         789
523         321         852         852
874         320         357         874
111         471         951         951
965         102         478         965


C1          C2          c3          MaxVal
----------- ----------- ----------- -----------
523         321         852         852
874         320         357         874

Hope this helps, Jason


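Edit #1: Many thanks to Erik for adding the link pointing out that a scalar UDF computed column prevents the optimizer from considering a parallel execution plan... even when the computed column is persisted. That's something I actually knew but completely overlooked in my original answer. What I didn't know about was OPTION(QUERYTRACEON 176)... Picking up that little nugget was more than worth the price of admission!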

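Edit #2: Without starting a religious debate over "NULL vs NOT NULL" column constraints, I'll simply state that my personal "default" is to make all columns NOT NULL unless there is a compelling reason not to... That said, @MartinSmith makes some good points... including the fact that the OP made all of the columns NULLable by not specifying nullability. Also, after the back and forth, I wanted to see whether RETURN ISNULL(@MaxVal, 0); was doing anything other than irritating people who read T-SQL... Short answer... it isn't.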
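
A quick way to see why the ISNULL is a no-op for dbo.Test: MAX skips NULL inputs, so it only returns NULL when every input is NULL, and that can't happen when all three columns are NOT NULL. The sketch below uses local variables (stand-ins for the function parameters) rather than the function itself.

```sql
-- One NULL among the inputs: MAX ignores it and returns the largest non-NULL value.
DECLARE @C1 INT = 123, @C2 INT = NULL, @C3 INT = 789;

SELECT MAX(x.Val) AS MaxVal
FROM ( VALUES (@C1), (@C2), (@C3) ) x (Val);

-- All inputs NULL: only now does MAX return NULL, and only now does ISNULL change the result.
SELECT @C1 = NULL, @C3 = NULL;

SELECT
   MAX(x.Val)            AS MaxVal,
   ISNULL(MAX(x.Val), 0) AS MaxValOrZero
FROM ( VALUES (@C1), (@C2), (@C3) ) x (Val);
GO
```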

The following introduces a "control" table (no computed column) along with NULLable versions of dbo.GreatestOfThreeInts & dbo.Test (dbo.GreatestOfThreeInts_2 & dbo.Test_2)...

CREATE FUNCTION dbo.GreatestOfThreeInts_2
/* ==================================================================================================
Scalar function created for the sole purpose of calculating the MaxVal computed column on dbo.Test_2.
================================================================================================== */
(
   @C1 INT,
   @C2 INT,
   @C3 INT
)
RETURNS INT WITH SCHEMABINDING
AS 
BEGIN 
   DECLARE @MaxVal INT = ( SELECT MAX(x.Val) FROM ( VALUES (@C1), (@C2), (@C3) ) x (Val) );
   RETURN @MaxVal;
END;
GO

CREATE TABLE dbo.Test_2 (
   C1 INT NULL,
   C2 INT NULL,
   c3 INT NULL,
   MaxVal AS dbo.GreatestOfThreeInts_2(C1, C2, C3) PERSISTED   -- persist the value so that it doesn't need to be constantly recomputed
   );
GO

CREATE NONCLUSTERED INDEX ix_Test2_MaxVal ON dbo.Test_2 (MaxVal) INCLUDE (C1, C2, c3);
GO 

CREATE TABLE dbo.Control (
   C1 int NOT NULL,
   C2 int NOT NULL,
   c3 int NOT NULL,
   MaxVal INT NOT NULL
   );
GO

CREATE NONCLUSTERED INDEX ix_Control_MaxVal ON dbo.Control (MaxVal) INCLUDE (C1, C2, c3);
GO

And because the 6 rows in my original answer aren't much of a test, the following loads all 3 tables with 1 million rows of test data...

-- clear out any existing data...
TRUNCATE TABLE dbo.Test;
GO 
TRUNCATE TABLE dbo.Test_2;
GO 
TRUNCATE TABLE dbo.Control;
GO 

-- add 1M rows of test data...
WITH 
   cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)), 
   cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b),
   cte_n3 (n) AS (SELECT 1 FROM cte_n2 a CROSS JOIN cte_n2 b),
   cte_Tally (c1, c2, c3) AS (
       SELECT TOP (1000000)
           ABS(CHECKSUM(NEWID())) % 9000 + 1000,   -- randomly generate INTs between 1000 and 9999
           ABS(CHECKSUM(NEWID())) % 9000 + 1000,   -- no, I don't have an actual reason for using that specific range...
           ABS(CHECKSUM(NEWID())) % 9000 + 1000    -- 
       FROM
           cte_n3 a CROSS JOIN cte_n3 b
       )
INSERT dbo.Test (C1, C2, c3)
SELECT 
   t.c1, 
   t.c2, 
   t.c3
FROM
   cte_Tally t;
GO

-- use dbo.Test to insert dbo.Test_2 & dbo.Control so all 3 tables will have the exact same data values...
-- (to compare actual insert performance, use the cte_Tally to load all tables)

INSERT dbo.Test_2 (C1, C2, c3)
SELECT 
   t.C1, t.C2, t.c3
FROM
   dbo.Test t;
GO

INSERT dbo.Control (C1, C2, c3, MaxVal)
SELECT 
   t.C1, t.C2, t.c3, t.MaxVal
FROM
   dbo.Test t;
GO 
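
Before comparing query performance, it may be worth a quick sanity check that the three tables really do hold the same data. The queries below are a sketch of one way to do that and are not part of the original answer; the column alias RowCnt is made up for illustration.

```sql
-- Row counts for all three tables.
SELECT 'Test' AS TableName, COUNT(*) AS RowCnt FROM dbo.Test
UNION ALL
SELECT 'Test_2', COUNT(*) FROM dbo.Test_2
UNION ALL
SELECT 'Control', COUNT(*) FROM dbo.Control;
GO

-- Any distinct row in dbo.Test with no identical row in dbo.Control shows up here.
-- Since dbo.Control was loaded from dbo.Test, an empty result is what you'd hope to see.
SELECT t.C1, t.C2, t.c3, t.MaxVal FROM dbo.Test t
EXCEPT
SELECT c.C1, c.C2, c.c3, c.MaxVal FROM dbo.Control c;
GO
```

Note that EXCEPT compares distinct rows, so this confirms matching values rather than matching duplicate counts; the row counts above cover the rest.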

Source: https://dba.stackexchange.com/questions/225895