MongoDB

Ranged Sharding - The logic behind how hexadecimal string ranges are formed

  • January 1, 2020

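I have a sharded collection whose shard key is a field named "uuid". The value of this field is a string that represents a hexadecimal value, i.e. a hex string. The "uuid" field is unique for every document.

MongoDB automatically partitions the data into chunks. I cannot figure out how MongoDB divides these hex strings into contiguous ranges. I could not find any documentation that explains how Mongo forms these ranges.

Can you help me understand how these ranges are formed?

As an example, I inserted 3,025,357 documents with hex values as described above. The chunks and their associated ranges are: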

{    
   "_id" : "database.sha_shard-uuid_MinKey",
   "lastmod" : Timestamp(2, 0),
   "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
   "ns" : "database.sha_shard",
   "min" : {
       "uuid" : { "$minKey" : 1 }
   },
   "max" : {
       "uuid" : "000043c071f23fc889275f77f950c649faac92e0"
   },
   "shard" : "shardRpSet2",
   "history" : [ 
       {
           "validAfter" : Timestamp(1577632842, 37),
           "shard" : "shardRpSet2"
       }
   ]
},{
   "_id" : "database.sha_shard-uuid_\"5b935a89d91977490d04f740a86bccc2b3cc2bfb\"",
   "lastmod" : Timestamp(3, 5),
   "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
   "ns" : "database.sha_shard",
   "min" : {
       "uuid" : "5b935a89d91977490d04f740a86bccc2b3cc2bfb"
   },
   "max" : {
       "uuid" : "7a25fa7aa3a86ed259f646d7890db370e8b43ae7"
   },
   "shard" : "shardRpSet1",
   "history" : [ 
       {
           "validAfter" : Timestamp(1577632856, 21509),
           "shard" : "shardRpSet1"
       }
   ]
},{
   "_id" : "database.sha_shard-uuid_\"7a25fa7aa3a86ed259f646d7890db370e8b43ae7\"",
   "lastmod" : Timestamp(3, 6),
   "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
   "ns" : "database.sha_shard",
   "min" : {
       "uuid" : "7a25fa7aa3a86ed259f646d7890db370e8b43ae7"
   },
   "max" : {
       "uuid" : "810b573464d4894fc40b428ec82ec54d9a681bf6"
   },
   "shard" : "shardRpSet1",
   "history" : [ 
       {
           "validAfter" : Timestamp(1577632856, 21509),
           "shard" : "shardRpSet1"
       }
   ]
},{
   "_id" : "database.sha_shard-uuid_\"000043c071f23fc889275f77f950c649faac92e0\"",
   "lastmod" : Timestamp(4, 0),
   "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
   "ns" : "database.sha_shard",
   "min" : {
       "uuid" : "000043c071f23fc889275f77f950c649faac92e0"
   },
   "max" : {
       "uuid" : "1e8421c5d4f3eb45a82c2785bccc81fa7abfbfc7"
   },
   "shard" : "shardRpSet2",
   "history" : [ 
       {
           "validAfter" : Timestamp(1577635896, 15268),
           "shard" : "shardRpSet2"
       }
   ]
},{
   "_id" : "database.sha_shard-uuid_\"1e8421c5d4f3eb45a82c2785bccc81fa7abfbfc7\"",
   "lastmod" : Timestamp(5, 0),
   "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
   "ns" : "database.sha_shard",
   "min" : {
       "uuid" : "1e8421c5d4f3eb45a82c2785bccc81fa7abfbfc7"
   },
   "max" : {
       "uuid" : "3d165990d2969bbaf79b6b0d790080b46ca5f056"
   },
   "shard" : "shardRpSet",
   "history" : [ 
       {
           "validAfter" : Timestamp(1577635906, 26457),
           "shard" : "shardRpSet"
       }
   ]
},{
   "_id" : "database.sha_shard-uuid_\"3d165990d2969bbaf79b6b0d790080b46ca5f056\"",
   "lastmod" : Timestamp(5, 1),
   "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
   "ns" : "database.sha_shard",
   "min" : {
       "uuid" : "3d165990d2969bbaf79b6b0d790080b46ca5f056"
   },
   "max" : {
       "uuid" : "5b935a89d91977490d04f740a86bccc2b3cc2bfb"
   },
   "shard" : "shardRpSet1",
   "history" : [ 
       {
           "validAfter" : Timestamp(1577632856, 21509),
           "shard" : "shardRpSet1"
       }
   ]
},{
   "_id" : "database.sha_shard-uuid_\"c1788722a31a5a5a5caa00816ad85aeeda26e581\"",
   "lastmod" : Timestamp(5, 2),
   "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
   "ns" : "database.sha_shard",
   "min" : {
       "uuid" : "c1788722a31a5a5a5caa00816ad85aeeda26e581"
   },
   "max" : {
       "uuid" : "dcbd245e03d425aa14a85b51befde274856fc5f3"
   },
   "shard" : "shardRpSet",
   "history" : [ 
       {
           "validAfter" : Timestamp(1577630416, 3),
           "shard" : "shardRpSet"
       }
   ]
},{
   "_id" : "database.sha_shard-uuid_\"dcbd245e03d425aa14a85b51befde274856fc5f3\"",
   "lastmod" : Timestamp(5, 3),
   "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
   "ns" : "database.sha_shard",
   "min" : {
       "uuid" : "dcbd245e03d425aa14a85b51befde274856fc5f3"
   },
   "max" : {
       "uuid" : "fffff8c5e160711fb48f0d38ce01a98880e869e2"
   },
   "shard" : "shardRpSet",
   "history" : [ 
       {
           "validAfter" : Timestamp(1577630416, 3),
           "shard" : "shardRpSet"
       }
   ]
},{
   "_id" : "database.sha_shard-uuid_\"fffff8c5e160711fb48f0d38ce01a98880e869e2\"",
   "lastmod" : Timestamp(6, 0),
   "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
   "ns" : "database.sha_shard",
   "min" : {
       "uuid" : "fffff8c5e160711fb48f0d38ce01a98880e869e2"
   },
   "max" : {
       "uuid" : { "$maxKey" : 1 }
   },
   "shard" : "shardRpSet2",
   "history" : [ 
       {
           "validAfter" : Timestamp(1577636268, 67),
           "shard" : "shardRpSet2"
       }
   ]
},{
   "_id" : "database.sha_shard-uuid_\"810b573464d4894fc40b428ec82ec54d9a681bf6\"",
   "lastmod" : Timestamp(6, 1),
   "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
   "ns" : "database.sha_shard",
   "min" : {
       "uuid" : "810b573464d4894fc40b428ec82ec54d9a681bf6"
   },
   "max" : {
       "uuid" : "c1788722a31a5a5a5caa00816ad85aeeda26e581"
   },
   "shard" : "shardRpSet",
   "history" : [ 
       {
           "validAfter" : Timestamp(1577630416, 3),
           "shard" : "shardRpSet"
       }
   ]
}

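For reference on how sharded chunks work, see: https://docs.mongodb.com/v4.0/core/sharding-data-partitioning/

MongoDB uses the shard key associated with the collection to partition the data into chunks. A chunk consists of a subset of sharded data. Each chunk has an inclusive lower and exclusive upper range based on the shard key.

The mongos routes writes to the appropriate chunk based on the shard key value. MongoDB splits chunks when they grow beyond the configured chunk size. Both inserts and updates can trigger a chunk split.

Now, to understand which records a chunk will contain, we need to understand the **"Each chunk has an inclusive lower and exclusive upper range based on the shard key"** part, which from now on we will call the **chunk range**.

For example, take this chunk: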
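The "inclusive lower, exclusive upper" rule can be sketched in a few lines of Python. This is only an illustration of the rule, not MongoDB's routing code: the function name `chunk_range` is hypothetical, the split points are the chunk boundaries copied from the output in the question, `None` stands in for MinKey/MaxKey, and strings are assumed to be compared byte by byte.

```python
from bisect import bisect_right

# Chunk boundaries copied from the config.chunks output in the question,
# in ascending order. The first chunk implicitly starts at MinKey and the
# last chunk implicitly ends at MaxKey.
SPLIT_POINTS = [
    b"000043c071f23fc889275f77f950c649faac92e0",
    b"1e8421c5d4f3eb45a82c2785bccc81fa7abfbfc7",
    b"3d165990d2969bbaf79b6b0d790080b46ca5f056",
    b"5b935a89d91977490d04f740a86bccc2b3cc2bfb",
    b"7a25fa7aa3a86ed259f646d7890db370e8b43ae7",
    b"810b573464d4894fc40b428ec82ec54d9a681bf6",
    b"c1788722a31a5a5a5caa00816ad85aeeda26e581",
    b"dcbd245e03d425aa14a85b51befde274856fc5f3",
    b"fffff8c5e160711fb48f0d38ce01a98880e869e2",
]


def chunk_range(uuid):
    """Return the (min, max) bounds of the chunk that would hold `uuid`.

    None stands for MinKey (as min) or MaxKey (as max). The lower bound is
    inclusive and the upper bound is exclusive.
    """
    key = uuid.encode("utf-8")
    # bisect_right counts the split points that are <= key, so a key equal
    # to a split point lands in the chunk that starts at that split point.
    i = bisect_right(SPLIT_POINTS, key)
    lower = SPLIT_POINTS[i - 1].decode() if i > 0 else None
    upper = SPLIT_POINTS[i].decode() if i < len(SPLIT_POINTS) else None
    return lower, upper


if __name__ == "__main__":
    # A value equal to a boundary belongs to the chunk whose min is that boundary.
    print(chunk_range("5b935a89d91977490d04f740a86bccc2b3cc2bfb"))
    # A value below every boundary falls into the MinKey chunk.
    print(chunk_range("0000000000000000000000000000000000000000"))
```

Because the upper bound is exclusive, a "uuid" equal to a boundary value always belongs to the chunk that lists that value as its min, never to the chunk that lists it as its max.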

{    
   "_id" : "database.sha_shard-uuid_MinKey",
   "lastmod" : Timestamp(2, 0),
   "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"),
   "ns" : "database.sha_shard",
   "min" : {
       "uuid" : { "$minKey" : 1 }
   },
   "max" : {
       "uuid" : "000043c071f23fc889275f77f950c649faac92e0"
   },
   "shard" : "shardRpSet2",
   "history" : [ 
       {
           "validAfter" : Timestamp(1577632842, 37),
           "shard" : "shardRpSet2"
       }
   ]
}

The min and max fields are the chunk range:

"min" : {
    "uuid" : { "$minKey" : 1 }
},
"max" : {
    "uuid" : "000043c071f23fc889275f77f950c649faac92e0"
},

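This range defines what is inside the chunk. You can learn how the range works by reading the BSON comparison order reference: https://docs.mongodb.com/v4.0/reference/bson-type-comparison-order/

In your case, if the uuid field contains only strings, this is how a record is evaluated as being inside a chunk range:

Binary comparison of strings

By default, MongoDB uses a simple binary comparison to compare strings.

Source: https://dba.stackexchange.com/questions/256438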
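To make the comparison rule concrete, here is a small Python sketch that approximates a simple binary comparison by comparing the UTF-8 bytes of two strings from left to right. The helper `binary_less_than` is a hypothetical name used only for illustration, and the sample values are chunk boundaries from the question. It shows that the hex strings are ordered as text, not as numbers, although for equal-length lowercase hex strings the two orders happen to coincide.

```python
def binary_less_than(a, b):
    """Compare two strings by their UTF-8 bytes, left to right."""
    return a.encode("utf-8") < b.encode("utf-8")


# A few chunk boundaries taken from the question, deliberately shuffled.
bounds = [
    "810b573464d4894fc40b428ec82ec54d9a681bf6",
    "000043c071f23fc889275f77f950c649faac92e0",
    "fffff8c5e160711fb48f0d38ce01a98880e869e2",
    "3d165990d2969bbaf79b6b0d790080b46ca5f056",
]

# Equal-length, lowercase hex strings: byte order matches numeric order,
# because '0'-'9' (0x30-0x39) sort below 'a'-'f' (0x61-0x66).
sorted_by_bytes = sorted(bounds, key=lambda s: s.encode("utf-8"))
sorted_by_value = sorted(bounds, key=lambda s: int(s, 16))
same_order = sorted_by_bytes == sorted_by_value

# Mixed case breaks the match: 'F' (0x46) sorts before 'a' (0x61),
# even though 0xF0 is numerically larger than 0xa0.
upper_before_lower = binary_less_than("F0", "a0")

# Different lengths break it too: "10" sorts before "9" byte-wise,
# even though 0x10 is numerically larger than 0x9.
short_vs_long = binary_less_than("10", "9")

if __name__ == "__main__":
    print(same_order, upper_before_lower, short_vs_long)
```

So the ranges in the question look numeric only because every "uuid" has the same length and uses lowercase digits; the boundaries are still being compared as plain strings.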