在現有表上進行表分區並使用不同的文件組

December 24, 2021

我在主文件組上有一個現有表，我想對其進行分區。分區鍵在年份，這是一個計算列。我想以這樣的方式對錶進行分區，以便最終每年的數據都在它自己的文件組上。我首先想拆分 2 年，所以稍後我可以測試更多關於如何使用拆分命令拆分其他數據的內容。現在，我可以創建分區函式和方案，並且我還看到某年的數據在正確的分區中，但是我無法在正確的文件組中獲取物理數據。似乎數據仍駐留在該主文件組中。我嘗試重建索引，但這仍然沒有將數據移動到正確的文件組中。最後，該表將具有聚集列儲存索引，但我也嘗試使用聚集行儲存索引。我這樣做的原因是因為 SQL Server 似乎不允許列儲存索引拆分和合併非空分區（我嘗試了一些拆分和合併但結果相同的東西），所以我認為這至少可以工作。如果您有任何建議或意見，請在此填寫。順便說一句，我正在使用 SQL Server 2019。
現在對於程式碼，我使用 Stackoverflow2013 數據庫：
use StackOverflow2013;
go

-- Create file groups for partitions
alter database [StackOverflow2013]
add filegroup StackOverflow2013_2008;

ALTER DATABASE [StackOverflow2013]
   ADD FILE 
   (
   NAME = [StackOverflow2013_2008],
   FILENAME = 'E:\DATA\StackOverflow2013_2008.ndf',
       SIZE = 1024 KB, 
       MAXSIZE = UNLIMITED, 
       FILEGROWTH = 512 MB
   ) TO FILEGROUP [StackOverflow2013_2008]
   
alter database [StackOverflow2013]
add filegroup StackOverflow2013_2009;

ALTER DATABASE [StackOverflow2013]
   ADD FILE 
   (
   NAME = [StackOverflow2013_2009],
   FILENAME = 'E:\DATA\StackOverflow2013_2009.ndf',
       SIZE = 1024 KB, 
       MAXSIZE = UNLIMITED, 
       FILEGROWTH = 512 MB
   ) TO FILEGROUP [StackOverflow2013_2009]


-- Drop the current default index, we want to build one later on the partition key
ALTER TABLE [dbo].[Comments] DROP CONSTRAINT [PK_Comments_Id] WITH ( ONLINE = OFF )

-- Add partition key column
alter table [StackOverflow2013].[dbo].[Comments]
add [year] as (datepart(year, CreationDate));

go
-- Add partition function based on year 
-- For now we only want 2008 and 2009, other years will be migrated later to test with split function
create partition function fun_Comments(int)
as range left for values (2008, 2009);

-- Add partition scheme
create partition scheme scheme_Comments
as partition fun_Comments
to (StackOverflow2013_2008, StackOverflow2013_2009, [Primary]);

-- Check the partition numbers and who's next
SELECT DestinationId = DestinationDataSpaces.destination_id
   ,FilegroupName = Filegroups.name
   ,PartitionHighBoundaryValue = PartitionRangeValues.value
   ,IsNextUsed = CASE 
       WHEN DestinationDataSpaces.destination_id &gt; 1
           AND LAG(PartitionRangeValues.value, 1) OVER (
               ORDER BY DestinationDataSpaces.destination_id ASC
               ) IS NULL
           THEN 1
       ELSE 0
       END
FROM sys.partition_schemes AS PartitionSchemes
INNER JOIN sys.destination_data_spaces AS DestinationDataSpaces ON PartitionSchemes.data_space_id = DestinationDataSpaces.partition_scheme_id
INNER JOIN sys.filegroups AS Filegroups ON DestinationDataSpaces.data_space_id = Filegroups.data_space_id
LEFT OUTER JOIN sys.partition_range_values AS PartitionRangeValues ON PartitionSchemes.function_id = PartitionRangeValues.function_id
   AND DestinationDataSpaces.destination_id = PartitionRangeValues.boundary_id
WHERE PartitionSchemes.name = N'scheme_Comments'
ORDER BY DestinationId ASC;
檢查分區 1 的行
SELECT * FROM Comments 
WHERE $PARTITION.fun_Comments(year) = 1; 
檢查分區 2 的行
SELECT * FROM Comments 
WHERE $PARTITION.fun_Comments(year) = 2; 
檢查文件大小
（非常大的查詢）
– 創建新的聚集索引以正確分佈數據
create clustered index [CCIX_Comments] ON [dbo].[Comments] (year)
再次檢查文件大小
所以在我看來，所有數據實際上仍在主文件組中，因為新文件組是 emtpy。該表為 7 GB，因此我至少希望其中有一些數據。
所以基本上我的問題是，在這種情況下，如何正確地在文件組中的文件上重新分配數據？