Converting varchar data to datetime fails
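We are in the process of moving data from a legacy table (all varchar fields) into a strongly typed counterpart. Cue the cheering.
As part of this effort, we take data from the base Entity table and dump it into Entity_New if all of the data can be correctly converted to the appropriate types. Otherwise, it goes into a copy of the existing table called Entity_Bad.
We have a rules engine that validates data and types, so in theory the data should be clean even though it is stored in character fields. The reality is that I'm posting here because something has gone wrong and I can't find it. The CompletionDate field in Entity is varchar(46) NULL.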
The environment is

productversion  productlevel  edition
10.0.4064.0     SP2           Enterprise Edition (64-bit)

My script demonstrates what I'm doing, and what is and isn't working:
SET NOCOUNT ON

DECLARE
    @startid int = 0
,   @stopid int = 796833

-------------------------------------------------------------------------------
-- Check doesn't find anything wrong with CompletionDate
-------------------------------------------------------------------------------
SELECT
    1
FROM
    [dbo].[Entity] E
    INNER JOIN
        dbo.EntityBatch_New PB
        ON E.FiscalYear = PB.FiscalYear
        AND E.HashCode = PB.HashCode
        AND E.AAKey = PB.AAKey
        AND PB.ProcessResultCode IN ('A','W','M')
WHERE
    PB.EntityBatchId BETWEEN @StartId AND @StopId
    AND
    (
        -- check
        (isDate(E.[CompletionDate]) = 0 AND E.[CompletionDate] IS NOT NULL)
        AND (isDate(E.[CompletionDate]) = 1 AND CAST(E.[CompletionDate] AS datetime) BETWEEN '1753-01-01T00:00:00.000' AND '9999-12-31T23:59:59.997')
    )

-------------------------------------------------------------------------------
-- Only row that shows as non-date is the NULL one, which is expected
-------------------------------------------------------------------------------
SELECT DISTINCT
    (E.[CompletionDate] )
,   isDate(E.[CompletionDate])
FROM
    [dbo].[Entity] E
    INNER JOIN
        dbo.EntityBatch_New PB
        ON E.FiscalYear = PB.FiscalYear
        AND E.HashCode = PB.HashCode
        AND PB.ProcessResultCode IN ('A','W','M')
    -- Ensure we aren't pulling something we have already processed
    LEFT OUTER JOIN
        [dbo].[Entity_new] N
        ON N.HashCode = E.HashCode
        AND N.FiscalYear = E.FiscalYear
        AND E.AAKey = N.AAKey
    -- Ensure we aren't pulling something we have already processed (or was bad)
    LEFT OUTER JOIN
        [dbo].[Entity_bad] BAD
        ON BAD.HashCode = E.HashCode
        AND BAD.FiscalYear = E.FiscalYear
        AND E.AAKey = BAD.AAKey
WHERE
    PB.EntityBatchId BETWEEN @StartId AND @StopId
    AND N.FiscalYear IS NULL
    AND BAD.FiscalYear IS NULL
ORDER BY 2

-------------------------------------------------------------------------------
-- Make the cast and it blows with
-- The conversion of a varchar data type to a datetime data type resulted in an out-of-range value.
-------------------------------------------------------------------------------
SELECT DISTINCT
    (E.[CompletionDate] )
,   CAST(E.[CompletionDate] AS datetime) AS [CompletionDate]
FROM
    [dbo].[Entity] E
    INNER JOIN
        dbo.EntityBatch_New PB
        ON E.FiscalYear = PB.FiscalYear
        AND E.HashCode = PB.HashCode
        AND PB.ProcessResultCode IN ('A','W','M')
    -- Ensure we aren't pulling something we have already processed
    LEFT OUTER JOIN
        [dbo].[Entity_new] N
        ON N.HashCode = E.HashCode
        AND N.FiscalYear = E.FiscalYear
        AND E.AAKey = N.AAKey
    -- Ensure we aren't pulling something we have already processed (or was bad)
    LEFT OUTER JOIN
        [dbo].[Entity_bad] BAD
        ON BAD.HashCode = E.HashCode
        AND BAD.FiscalYear = E.FiscalYear
        AND E.AAKey = BAD.AAKey
WHERE
    PB.EntityBatchId BETWEEN @StartId AND @StopId
    AND N.FiscalYear IS NULL
    AND BAD.FiscalYear IS NULL

-------------------------------------------------------------------------------
-- Dump the values into a temporary table to slice and dice the values
-------------------------------------------------------------------------------
DECLARE @debug TABLE
(
    CompletionDate varchar(46) NULL
)

INSERT INTO
    @debug
SELECT DISTINCT
    (E.[CompletionDate] )
FROM
    [dbo].[Entity] E
    INNER JOIN
        dbo.EntityBatch_New PB
        ON E.FiscalYear = PB.FiscalYear
        AND E.HashCode = PB.HashCode
        AND PB.ProcessResultCode IN ('A','W','M')
    -- Ensure we aren't pulling something we have already processed
    LEFT OUTER JOIN
        [dbo].[Entity_new] N
        ON N.HashCode = E.HashCode
        AND N.FiscalYear = E.FiscalYear
        AND E.AAKey = N.AAKey
    -- Ensure we aren't pulling something we have already processed (or was bad)
    LEFT OUTER JOIN
        [dbo].[Entity_bad] BAD
        ON BAD.HashCode = E.HashCode
        AND BAD.FiscalYear = E.FiscalYear
        AND E.AAKey = BAD.AAKey
WHERE
    PB.EntityBatchId BETWEEN @StartId AND @StopId
    AND N.FiscalYear IS NULL
    AND BAD.FiscalYear IS NULL

-------------------------------------------------------------------------------
-- This is operating on all the same values as the failing query but magically works
-------------------------------------------------------------------------------
SELECT ALL
    CAST(E.[CompletionDate] AS datetime) AS [CompletionDate]
FROM
    @debug E

-------------------------------------------------------------------------------
-- Clearly, something is amiss when we extract the data so process each row
-- and find the culprit that way. Except this finds nothing wrong
-------------------------------------------------------------------------------
DECLARE
    @hash uniqueidentifier
,   @zee_date varchar(46)
,   @real_date datetime

DECLARE CSR CURSOR
READ_ONLY
FOR
SELECT
    E.HashCode
,   E.[CompletionDate]
FROM
    [dbo].[Entity] E
    INNER JOIN
        dbo.EntityBatch_New PB
        ON E.FiscalYear = PB.FiscalYear
        AND E.HashCode = PB.HashCode
        AND PB.ProcessResultCode IN ('A','W','M')
    -- Ensure we aren't pulling something we have already processed
    LEFT OUTER JOIN
        [dbo].[Entity_new] N
        ON N.HashCode = E.HashCode
        AND N.FiscalYear = E.FiscalYear
        AND E.AAKey = N.AAKey
    -- Ensure we aren't pulling something we have already processed (or was bad)
    LEFT OUTER JOIN
        [dbo].[Entity_bad] BAD
        ON BAD.HashCode = E.HashCode
        AND BAD.FiscalYear = E.FiscalYear
        AND E.AAKey = BAD.AAKey
WHERE
    PB.EntityBatchId BETWEEN @StartId AND @StopId
    AND N.FiscalYear IS NULL
    AND BAD.FiscalYear IS NULL

OPEN CSR

FETCH NEXT FROM CSR INTO @hash, @zee_date
WHILE (@@fetch_status = 0)
BEGIN
    BEGIN TRY
        SELECT @real_date = cast(@zee_date AS datetime)
    END TRY
    BEGIN CATCH
        print 'In here'
        print @hash
        print @zee_date
        SELECT @hash, @zee_date
    END CATCH
    FETCH NEXT FROM CSR INTO @hash, @zee_date
END

CLOSE csr
DEALLOCATE csr
Having blasted you with the wall of code above, here are the 406 unique values that the queries above operate on.
DECLARE @REAL_DATES TABLE ( CompletionDate varcharompletionDate AS datetime) AS casts_fine FROm @REAL_DATES RD
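I appreciate comments about SSIS or other approaches, but at this point in the game we are married to the TSQL conversion approach. If anyone can point out what I'm missing, I'll name my first child after you, assuming you don't mind changing your name to James.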
What happens when you do this?
DECLARE @REAL_DATES TABLE
(
    CompletionDate VARCHAR(46)
);

INSERT INTO @REAL_DATES
SELECT CompletionDate
FROM dbo.Entity;

SELECT CAST(RD.CompletionDate AS datetime) AS casts_fine
FROM @REAL_DATES RD;
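What I was getting at on Twitter is that the optimizer can attempt the conversion before it eliminates rows, so you can't consider only the CompletionDate values that the join returns.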
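Since filtering alone does not protect the conversion, one commonly used guard (not something proposed in this thread, just a sketch) is to wrap the CAST in a CASE expression, so the conversion is only reached for values that ISDATE accepts. The table variable and its sample rows below are made up for illustration.

```sql
-- Hypothetical sample data standing in for dbo.Entity
DECLARE @Entity TABLE
(
    CompletionDate varchar(46) NULL
);

INSERT INTO @Entity (CompletionDate) VALUES ('2011-03-15T00:00:00');
INSERT INTO @Entity (CompletionDate) VALUES ('not a date');
INSERT INTO @Entity (CompletionDate) VALUES (NULL);

-- The CAST sits inside the CASE branch, so it is only evaluated for rows
-- where ISDATE reports 1; it does not rely on a WHERE clause or a join
-- having removed the bad rows first.
SELECT
    E.CompletionDate AS raw_value
,   CASE
        WHEN ISDATE(E.CompletionDate) = 1
            THEN CAST(E.CompletionDate AS datetime)
    END AS converted_value   -- NULL when the string is not a valid datetime
FROM
    @Entity E;
```

Keep in mind that ISDATE and CAST both interpret strings according to the session's language and date format settings, so this guard is only as reliable as those settings are consistent.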
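My first suggestion would be to use the correct data type. Why use VARCHAR(46) to store a date? That is why there is bad data in the table, and why you have to convert explicitly whenever you want rich data that isn't a string (and which, IMHO, should never have been a string in the first place).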
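My next suggestion would be to correct all of the data in that column, and put measures in place so that it can't become invalid again. For example, validate that ISDATE(columnname) = 1.
If both of those fail, next on my list would be to return the data to the client and let it convert to datetime, or display it, or what have you. No matter where you filter out the rows that cause the problem, the optimizer can move that evaluation around so that the conversion is attempted before the bad rows are eliminated.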
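The ISDATE validation suggested above could be enforced with a CHECK constraint. The following is a minimal sketch on a throwaway temp table (#EntityDemo is a hypothetical name), assuming the existing data has already been cleaned. Note that ISDATE(NULL) returns 0, so NULLs have to be allowed explicitly.

```sql
-- Hypothetical demo table; the real column lives in dbo.Entity
CREATE TABLE #EntityDemo
(
    CompletionDate varchar(46) NULL
        -- Reject any non-NULL string that cannot be read as a datetime
        CHECK (CompletionDate IS NULL OR ISDATE(CompletionDate) = 1)
);

-- A valid date string and a NULL are both accepted
INSERT INTO #EntityDemo (CompletionDate) VALUES ('2011-03-15T00:00:00');
INSERT INTO #EntityDemo (CompletionDate) VALUES (NULL);

-- An invalid string violates the constraint; the error is caught and printed
BEGIN TRY
    INSERT INTO #EntityDemo (CompletionDate) VALUES ('not a date');
END TRY
BEGIN CATCH
    PRINT ERROR_MESSAGE();
END CATCH;

-- Only the two accepted rows remain
SELECT CompletionDate FROM #EntityDemo;

DROP TABLE #EntityDemo;
```

On a permanent table the same predicate could be attached with ALTER TABLE ... ADD CONSTRAINT once the existing rows have been corrected.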
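Finally, you could dump the query results into a temp table or table variable, and perform the conversion as a second step when you query that intermediate object. You should be confident that the dates are valid at that point; in fact, you could check first and raise an error if your join happened to return any rows with invalid dates.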
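A sketch of that two-step approach, assuming a simplified stand-in for the joined source: the @Source and @Staged table variables, the BatchId column, and the sample rows are all hypothetical. The first statement only copies strings, and the conversion runs in a separate statement against the staged rows.

```sql
-- Hypothetical stand-in for the joined source tables
DECLARE @Source TABLE
(
    BatchId int NOT NULL
,   CompletionDate varchar(46) NULL
);

INSERT INTO @Source (BatchId, CompletionDate) VALUES (1, '2011-03-15T00:00:00');
INSERT INTO @Source (BatchId, CompletionDate) VALUES (1, NULL);
INSERT INTO @Source (BatchId, CompletionDate) VALUES (2, 'not a date');

-- Step 1: materialize the filtered rows as plain strings, with no conversion
DECLARE @Staged TABLE
(
    CompletionDate varchar(46) NULL
);

INSERT INTO @Staged (CompletionDate)
SELECT S.CompletionDate
FROM @Source S
WHERE S.BatchId = 1;

-- Step 2: verify the staged values, then convert in a separate statement
IF EXISTS
(
    SELECT 1
    FROM @Staged S
    WHERE S.CompletionDate IS NOT NULL
      AND ISDATE(S.CompletionDate) = 0
)
BEGIN
    RAISERROR('Staged rows contain invalid CompletionDate values.', 16, 1);
END
ELSE
BEGIN
    SELECT CAST(S.CompletionDate AS datetime) AS CompletionDate
    FROM @Staged S;
END
```

Because the bad row never reaches @Staged, the CAST in the second step only sees values that survived the filter, whatever plan the optimizer picks for either statement.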
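Bottom line: (a) you can't make any assumptions about where in the stack the conversion attempt will take place, and (b) if you use the correct data types, you don't need these workarounds and hacks.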