Sql-Server

SQL Server 2016 和 SQL Server 2019 之間的分佈式可用性組不播種

  • February 19, 2022

很長的第一篇文章的道歉:-)

我計劃在現有 SQL2016 AG 和新 SQL Server 2019 AG 之間實施分佈式可用性組,並在建構之前在 LAB 中驗證設計。據我所知,應該支持這個 Dist AG,但我無法讓它工作。

實驗室包括:

  1. LAB1SQL01 - Windows Server Core 2016(評估許可證),執行 SQL Server 2016 (13.0.5026.0)
  2. LAB1SQL02 - Windows Server Core 2016(評估許可證),執行 SQL Server 2016 (13.0.5026.0)
  3. LAB1SQL03 - Windows Server 2019(評估許可證),執行 SQL Server 2019 (15.0.4023.6)
  4. LAB1SQL04 - Windows Server 2019(評估許可證),執行 SQL Server 2019 (15.0.4023.6)

我在所有伺服器上使用相同的組託管服務帳戶

我已經部署了以下 AlwaysOn 可用性組:

  1. LAB1SQLAG1 - 伺服器:LAB1SQL01 & LAB1SQL02,監聽器:LAB1SQLAG1L1
  2. LAB1SQLAG3 - 伺服器:LAB1SQL03 & LAB1SQL04,監聽器:LAB1SQLAG3L1
-- Create endpoints
-- Run on LAB1SQL01
USE [master]
GO
CREATE ENDPOINT [Endpoint_AG]
  STATE=STARTED
  AS TCP (LISTENER_PORT = 5022, LISTENER_IP = ALL)
  FOR DATA_MIRRORING (ROLE = ALL, AUTHENTICATION = WINDOWS NEGOTIATE
      , ENCRYPTION = REQUIRED ALGORITHM AES)
GO 

-- Run onLAB1SQL02
USE [master]
GO
CREATE ENDPOINT [Endpoint_AG]
  STATE=STARTED
  AS TCP (LISTENER_PORT = 5022, LISTENER_IP = ALL)
  FOR DATA_MIRRORING (ROLE = ALL, AUTHENTICATION = WINDOWS NEGOTIATE
      , ENCRYPTION = REQUIRED ALGORITHM AES)
GO

-- Create login and grant the SQL Server service account CONNECT permissions to the endpoint
-- Run on LAB1SQL01
USE master
GO
CREATE LOGIN [LAB1\svc.sql01.lab1$] FROM WINDOWS;
GO
GRANT CONNECT ON ENDPOINT::Endpoint_AG
TO [LAB1\svc.sql01.lab1$]; 
GO

-- Run on LAB1SQL02
USE master
GO
CREATE LOGIN [LAB1\svc.sql01.lab1$] FROM WINDOWS;
GO
GRANT CONNECT ON ENDPOINT::Endpoint_AG
TO [LAB1\svc.sql01.lab1$]; 
GO


-- Create an Availability Group LAB1SQLAG1
-- Run on LAB1SQL01
CREATE AVAILABILITY GROUP [LAB1SQLAG1]
FOR 
REPLICA ON
N'LAB1SQL01' WITH
(
  ENDPOINT_URL = N'TCP://LAB1SQL01.lab1.local:5022', 
  FAILOVER_MODE = MANUAL, 
  AVAILABILITY_MODE = SYNCHRONOUS_COMMIT, 
  SEEDING_MODE = AUTOMATIC
), 
N'LAB1SQL02' WITH
(  ENDPOINT_URL = N'TCP://LAB1SQL02.lab1.local:5022', 
  FAILOVER_MODE = MANUAL, 
  AVAILABILITY_MODE = SYNCHRONOUS_COMMIT, 
  SEEDING_MODE = AUTOMATIC
)
LISTENER 'LAB1SQLAG1L1'
(
  WITH IP ( ('10.16.16.34','255.255.255.0') ) ,
  PORT = 1433
); 
GO

-- Run on LAB1SQL02
ALTER AVAILABILITY GROUP [LAB1SQLAG1] JOIN
--Allow the Availability Group to create databases on behalf of the primary replica
ALTER AVAILABILITY GROUP [LAB1SQLAG1] GRANT CREATE ANY DATABASE
GO

---------------------------------------------------------------------------
-- Create endpoints
-- Run on LAB1SQL03
USE [master]
GO
CREATE ENDPOINT [Endpoint_AG]
  STATE=STARTED
  AS TCP (LISTENER_PORT = 5022, LISTENER_IP = ALL)
  FOR DATA_MIRRORING (ROLE = ALL, AUTHENTICATION = WINDOWS NEGOTIATE
      , ENCRYPTION = REQUIRED ALGORITHM AES)
GO 

-- Run on LAB1SQL04
USE [master]
GO
CREATE ENDPOINT [Endpoint_AG]
  STATE=STARTED
  AS TCP (LISTENER_PORT = 5022, LISTENER_IP = ALL)
  FOR DATA_MIRRORING (ROLE = ALL, AUTHENTICATION = WINDOWS NEGOTIATE
      , ENCRYPTION = REQUIRED ALGORITHM AES)
GO

-- Create login and grant the SQL Server service account CONNECT permissions to the endpoint
-- Run on LAB1SQL03
USE master
GO
CREATE LOGIN [LAB1\svc.sql01.lab1$] FROM WINDOWS;
GO
GRANT CONNECT ON ENDPOINT::Endpoint_AG
TO [LAB1\svc.sql01.lab1$]; 
GO

-- Run on LAB1SQL04
USE master
GO
CREATE LOGIN [LAB1\svc.sql01.lab1$] FROM WINDOWS;
GO
GRANT CONNECT ON ENDPOINT::Endpoint_AG
TO [LAB1\svc.sql01.lab1$]; 
GO


-- Create an Availability Group LAB1SQLAG3
-- Run on LAB1SQL03
CREATE AVAILABILITY GROUP [LAB1SQLAG3]
FOR 
REPLICA ON
N'LAB1SQL03' WITH
(
  ENDPOINT_URL = N'TCP://LAB1SQL03.lab1.local:5022', 
  FAILOVER_MODE = MANUAL, 
  AVAILABILITY_MODE = SYNCHRONOUS_COMMIT, 
  SEEDING_MODE = AUTOMATIC
), 
N'LAB1SQL04' WITH
(  ENDPOINT_URL = N'TCP://LAB1SQL04.lab1.local:5022', 
  FAILOVER_MODE = MANUAL, 
  AVAILABILITY_MODE = SYNCHRONOUS_COMMIT, 
  SEEDING_MODE = AUTOMATIC
)
LISTENER 'LAB1SQLAG3L1'
(
  WITH IP ( ('10.16.16.54','255.255.255.0') ) ,
  PORT = 1433
); 
GO

-- Run on LAB1SQL04
ALTER AVAILABILITY GROUP [LAB1SQLAG3] JOIN
--Allow the Availability Group to create databases on behalf of the primary replica
ALTER AVAILABILITY GROUP [LAB1SQLAG3] GRANT CREATE ANY DATABASE
GO

然後,我通過在 LAB1SQL01(用於 AG LAB1SQLAG1)和 LAB1SQL03(用於 LAB1SQLAG3)上創建測試數據庫來測試 2 個 AG,並確認兩個數據庫都同步到各自的副本。然後我刪除了數據庫。

然後創建了以下分佈式可用性組:

  1. DistAG_AG1_AG3 - 從 LAB1SQLAG1 到 LAB1SQLAG3 的 Dist AG
--Run on LAB1SQL01
CREATE AVAILABILITY GROUP [DistAG_AG1_AG3] 
WITH (DISTRIBUTED)  
AVAILABILITY GROUP ON 
'LAB1SQLAG1' WITH   
(  
  LISTENER_URL = 'TCP://LAB1SQLAG1L1.LAB1.LOCAL:5022',   
  AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,  
  FAILOVER_MODE = MANUAL,  
  SEEDING_MODE = AUTOMATIC  
),  
'LAB1SQLAG3' WITH   
(  
  LISTENER_URL = 'TCP://LAB1SQLAG3L1.LAB1.LOCAL:5022',  
  AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,  
  FAILOVER_MODE = MANUAL,  
  SEEDING_MODE = AUTOMATIC  
);   
GO 

--Run on LAB1SQL03
ALTER AVAILABILITY GROUP [DistAG_AG1_AG3]  
JOIN  
AVAILABILITY GROUP ON 
'LAB1SQLAG1' WITH   
(  
  LISTENER_URL = 'TCP://LAB1SQLAG1L1.LAB1.LOCAL:5022',   
  AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,  
  FAILOVER_MODE = MANUAL,  
  SEEDING_MODE = AUTOMATIC  
),  
'LAB1SQLAG3' WITH   
(  
  LISTENER_URL = 'TCP://LAB1SQLAG3L1.LAB1.LOCAL:5022',  
  AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,  
  FAILOVER_MODE = MANUAL,  
  SEEDING_MODE = AUTOMATIC  
);   
GO    

但是這次我在LAB1SQL01上創建了一個測試數據庫,然後添加到本地AG LAB1SQLAG1,輔助AG發生了奇怪的事情。

上的數據庫出現 LAB1SQL03 但仍處於“同步/恢復中”狀態,並且它不會作為數據庫出現在 LAB1SQL04 上(但確實出現在可用性數據庫下,旁邊有一個星號。

LAB1SQL03 未播種 LAB1SQL04 未播種

當我查看 LAB1SQL04 上的日誌時,絕對沒有任何內容。當我查看 LAB1SQL03 上的日誌時,我可以看到數據庫正在恢復,然後出現以下錯誤:

Always On: DebugTraceVarArgs AR '[HADR] [Primary] operation on replicas [9DAB7230-0E72-450E-ADFC-ECCDAE1349E7]->[0B7ABD01-D534-41AB-AFEB-91F6A3D89D46], database [SQL01TestDB1], remote endpoint [TCP://LAB1SQL04.lab1.local:5022], source operation [497E843D-F993-49CD-9816-C3800DCF18B4]: Seeding encountered a transient failure, state '20', retrying...'
Always On: DebugTraceVarArgs AR '[HADR] [Primary] operation on replicas [9DAB7230-0E72-450E-ADFC-ECCDAE1349E7]->[0B7ABD01-D534-41AB-AFEB-91F6A3D89D46], database [SQL01TestDB1], remote endpoint [TCP://LAB1SQL04.lab1.local:5022], source operation [7781AE86-A91F-4523-9F3C-32D5A5201AE8]: Transitioning from [PENDING] to [CHECK_IF_SEEDING_NEEDED].'
Always On: DebugTraceVarArgs AR '[HADR] [Primary] operation on replicas [9DAB7230-0E72-450E-ADFC-ECCDAE1349E7]->[0B7ABD01-D534-41AB-AFEB-91F6A3D89D46], database [SQL01TestDB1], remote endpoint [TCP://LAB1SQL04.lab1.local:5022], source operation [7781AE86-A91F-4523-9F3C-32D5A5201AE8]: Seeding task failed with result 0x80040e19.'
Always On: DebugTraceVarArgs AR 'Seeding is canceled with cancelReason = 20'
Always On: DebugTraceVarArgs AR '[HADR] [Primary] operation on replicas [9DAB7230-0E72-450E-ADFC-ECCDAE1349E7]->[0B7ABD01-D534-41AB-AFEB-91F6A3D89D46], database [SQL01TestDB1], remote endpoint [TCP://LAB1SQL04.lab1.local:5022], source operation [7781AE86-A91F-4523-9F3C-32D5A5201AE8]: Transitioning from [CHECK_IF_SEEDING_NEEDED] to [FAILED].'
Always On: DebugTraceVarArgs AR '[HADR] [Primary] operation on replicas [9DAB7230-0E72-450E-ADFC-ECCDAE1349E7]->[0B7ABD01-D534-41AB-AFEB-91F6A3D89D46], database [SQL01TestDB1], remote endpoint [TCP://LAB1SQL04.lab1.local:5022], source operation [7781AE86-A91F-4523-9F3C-32D5A5201AE8]: Seeding encountered a transient failure, state '20', maximum retries exceeded'

我已經測試了上述配置 SQL 2016 AG 到 SQL 2016 AG,它執行良好。所以我不認為這是我的配置。所有伺服器都使用相同的 configfile.ini 建構並具有相同的磁碟佈局。我還嘗試在所有伺服器上禁用防火牆。

作業系統都是評估許可證。我從評估 SQL 許可證開始,認為這可能會導致問題,因此從工作中獲取企業副本進行測試,但它沒有幫助。

關於我需要做什麼才能使其正常工作的任何建議?我還沒有找到其他人使用我遇到的錯誤程式碼。

提前感謝所有閱讀到底部的人:-)

上的數據庫出現 LAB1SQL03 但仍處於“同步/恢復中”狀態,並且它不會作為數據庫出現在 LAB1SQL04 上(但確實出現在可用性數據庫下,旁邊有一個星號。

這基本上是設計使然,儘管感覺並非如此。讓我解釋。2016 上的數據庫是 2016 格式,它是線上可用的,它可以播種到普通 AG 中的所有輔助副本,也可以播種到轉發器,因為基礎數據庫來自主副本。當涉及到分佈式 AG 的轉發器端時,數據庫不可用,它正在等待數據庫升級到 2019,它不可讀或無法訪問,並且轉發端的輔助副本從轉發器獲取其數據庫副本,而不是global primary,不可用,因為需要先升級。

您的選擇是暫時在轉發器端手動播種輔助副本。

我做了從 SQL2016 AG 到 SQL2019 AG 的相同分佈式 AG 配置。我第一次使用 Active Directory 連接,我很不走運,雖然 sql 語句沒有錯誤,但沒有建立到輔助的連接。然後我用證書的方式連接了primary和forwarder,這次就好了。我可以看到連接已建立。我還注意到,從輔助分佈式副本中,數據庫狀態為 Synchronized/In Recovery。並且無法訪問數據庫。我認為這是從 sql2016 到 Sql2019 實現 DAG 時的正常行為。然後我做了failover測試,DAG成功failover到forwarder,從sql2019的secondary在failover後自動同步。所以我看到可以配置從 SQL2016 到 SQL 2019 的 DAG。

引用自:https://dba.stackexchange.com/questions/262031