MongoDB
Ranged Sharding - the logic behind how ranges of hexadecimal strings are formed
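I have a sharded collection whose shard key is a field named "uuid". The field's values are strings that represent hexadecimal values, i.e. hex strings. The "uuid" field is unique for every document.
MongoDB automatically partitions the data into chunks. I can't figure out how MongoDB divides these hex strings into contiguous ranges, and I could not find any documentation that explains how Mongo forms these ranges.
Can you help me understand how these ranges are formed?
As an example, I inserted 3025357 documents with hex values as described above. The chunks and ranges associated with them are: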
{ "_id" : "database.sha_shard-uuid_MinKey", "lastmod" : Timestamp(2, 0), "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"), "ns" : "database.sha_shard", "min" : { "uuid" : { "$minKey" : 1 } }, "max" : { "uuid" : "000043c071f23fc889275f77f950c649faac92e0" }, "shard" : "shardRpSet2", "history" : [ { "validAfter" : Timestamp(1577632842, 37), "shard" : "shardRpSet2" } ] },{ "_id" : "database.sha_shard-uuid_\"5b935a89d91977490d04f740a86bccc2b3cc2bfb\"", "lastmod" : Timestamp(3, 5), "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"), "ns" : "database.sha_shard", "min" : { "uuid" : "5b935a89d91977490d04f740a86bccc2b3cc2bfb" }, "max" : { "uuid" : "7a25fa7aa3a86ed259f646d7890db370e8b43ae7" }, "shard" : "shardRpSet1", "history" : [ { "validAfter" : Timestamp(1577632856, 21509), "shard" : "shardRpSet1" } ] },{ "_id" : "database.sha_shard-uuid_\"7a25fa7aa3a86ed259f646d7890db370e8b43ae7\"", "lastmod" : Timestamp(3, 6), "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"), "ns" : "database.sha_shard", "min" : { "uuid" : "7a25fa7aa3a86ed259f646d7890db370e8b43ae7" }, "max" : { "uuid" : "810b573464d4894fc40b428ec82ec54d9a681bf6" }, "shard" : "shardRpSet1", "history" : [ { "validAfter" : Timestamp(1577632856, 21509), "shard" : "shardRpSet1" } ] },{ "_id" : "database.sha_shard-uuid_\"000043c071f23fc889275f77f950c649faac92e0\"", "lastmod" : Timestamp(4, 0), "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"), "ns" : "database.sha_shard", "min" : { "uuid" : "000043c071f23fc889275f77f950c649faac92e0" }, "max" : { "uuid" : "1e8421c5d4f3eb45a82c2785bccc81fa7abfbfc7" }, "shard" : "shardRpSet2", "history" : [ { "validAfter" : Timestamp(1577635896, 15268), "shard" : "shardRpSet2" } ] },{ "_id" : "database.sha_shard-uuid_\"1e8421c5d4f3eb45a82c2785bccc81fa7abfbfc7\"", "lastmod" : Timestamp(5, 0), "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"), "ns" : "database.sha_shard", "min" : { "uuid" : "1e8421c5d4f3eb45a82c2785bccc81fa7abfbfc7" }, "max" : { "uuid" : 
"3d165990d2969bbaf79b6b0d790080b46ca5f056" }, "shard" : "shardRpSet", "history" : [ { "validAfter" : Timestamp(1577635906, 26457), "shard" : "shardRpSet" } ] },{ "_id" : "database.sha_shard-uuid_\"3d165990d2969bbaf79b6b0d790080b46ca5f056\"", "lastmod" : Timestamp(5, 1), "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"), "ns" : "database.sha_shard", "min" : { "uuid" : "3d165990d2969bbaf79b6b0d790080b46ca5f056" }, "max" : { "uuid" : "5b935a89d91977490d04f740a86bccc2b3cc2bfb" }, "shard" : "shardRpSet1", "history" : [ { "validAfter" : Timestamp(1577632856, 21509), "shard" : "shardRpSet1" } ] },{ "_id" : "database.sha_shard-uuid_\"c1788722a31a5a5a5caa00816ad85aeeda26e581\"", "lastmod" : Timestamp(5, 2), "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"), "ns" : "database.sha_shard", "min" : { "uuid" : "c1788722a31a5a5a5caa00816ad85aeeda26e581" }, "max" : { "uuid" : "dcbd245e03d425aa14a85b51befde274856fc5f3" }, "shard" : "shardRpSet", "history" : [ { "validAfter" : Timestamp(1577630416, 3), "shard" : "shardRpSet" } ] },{ "_id" : "database.sha_shard-uuid_\"dcbd245e03d425aa14a85b51befde274856fc5f3\"", "lastmod" : Timestamp(5, 3), "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"), "ns" : "database.sha_shard", "min" : { "uuid" : "dcbd245e03d425aa14a85b51befde274856fc5f3" }, "max" : { "uuid" : "fffff8c5e160711fb48f0d38ce01a98880e869e2" }, "shard" : "shardRpSet", "history" : [ { "validAfter" : Timestamp(1577630416, 3), "shard" : "shardRpSet" } ] },{ "_id" : "database.sha_shard-uuid_\"fffff8c5e160711fb48f0d38ce01a98880e869e2\"", "lastmod" : Timestamp(6, 0), "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"), "ns" : "database.sha_shard", "min" : { "uuid" : "fffff8c5e160711fb48f0d38ce01a98880e869e2" }, "max" : { "uuid" : { "$maxKey" : 1 } }, "shard" : "shardRpSet2", "history" : [ { "validAfter" : Timestamp(1577636268, 67), "shard" : "shardRpSet2" } ] },{ "_id" : "database.sha_shard-uuid_\"810b573464d4894fc40b428ec82ec54d9a681bf6\"", "lastmod" : Timestamp(6, 1), 
"lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"), "ns" : "database.sha_shard", "min" : { "uuid" : "810b573464d4894fc40b428ec82ec54d9a681bf6" }, "max" : { "uuid" : "c1788722a31a5a5a5caa00816ad85aeeda26e581" }, "shard" : "shardRpSet", "history" : [ { "validAfter" : Timestamp(1577630416, 3), "shard" : "shardRpSet" } ] }
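Reference on how sharded chunks work: https://docs.mongodb.com/v4.0/core/sharding-data-partitioning/
MongoDB uses the shard key associated with the collection to partition the data into chunks. A chunk consists of a subset of sharded data. Each chunk has an inclusive lower and exclusive upper range based on the shard key.
The mongos routes writes to the appropriate chunk based on the shard key value. MongoDB splits chunks when they grow beyond the configured chunk size. Both inserts and updates can trigger a chunk split.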
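The routing step described above can be sketched as a binary search over the sorted chunk boundaries. This is only an illustration, not how mongos is actually implemented: the names `SPLIT_POINTS`, `SHARDS` and `chunk_for` are hypothetical, and the boundary values and shard names are copied from the chunk list in the question. Plain Python string comparison is used here, which for ASCII hex strings gives the same order as a byte-by-byte comparison.

```python
from bisect import bisect_right
from typing import Tuple

# Chunk boundaries copied from the question's chunk list, in sorted order.
SPLIT_POINTS = [
    "000043c071f23fc889275f77f950c649faac92e0",
    "1e8421c5d4f3eb45a82c2785bccc81fa7abfbfc7",
    "3d165990d2969bbaf79b6b0d790080b46ca5f056",
    "5b935a89d91977490d04f740a86bccc2b3cc2bfb",
    "7a25fa7aa3a86ed259f646d7890db370e8b43ae7",
    "810b573464d4894fc40b428ec82ec54d9a681bf6",
    "c1788722a31a5a5a5caa00816ad85aeeda26e581",
    "dcbd245e03d425aa14a85b51befde274856fc5f3",
    "fffff8c5e160711fb48f0d38ce01a98880e869e2",
]

# Shard that owns each chunk: SHARDS[i] owns the chunk that ends at
# SPLIT_POINTS[i]; the last entry owns the chunk that ends at MaxKey.
SHARDS = [
    "shardRpSet2", "shardRpSet2", "shardRpSet", "shardRpSet1", "shardRpSet1",
    "shardRpSet1", "shardRpSet", "shardRpSet", "shardRpSet", "shardRpSet2",
]


def chunk_for(uuid: str) -> Tuple[str, str, str]:
    """Return (lower bound, upper bound, shard) of the chunk holding uuid."""
    # bisect_right places a key that equals a boundary into the chunk that
    # starts at that boundary, matching the inclusive lower bound.
    i = bisect_right(SPLIT_POINTS, uuid)
    lower = SPLIT_POINTS[i - 1] if i > 0 else "MinKey"
    upper = SPLIT_POINTS[i] if i < len(SPLIT_POINTS) else "MaxKey"
    return lower, upper, SHARDS[i]


if __name__ == "__main__":
    # A key equal to a boundary belongs to the chunk that starts there.
    print(chunk_for("5b935a89d91977490d04f740a86bccc2b3cc2bfb"))
    # A key smaller than every boundary falls into the MinKey chunk.
    print(chunk_for("0" * 40))
```

Note that every boundary in the list is an actual 40-character key, which suggests the split points are taken from shard key values present in the data rather than computed as evenly spaced hexadecimal numbers.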
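Now, to understand which records a chunk will contain, we need to understand the part **"Each chunk has an inclusive lower and exclusive upper range based on the shard key"**, which from now on we will call the **chunk range**.
For example, take this chunk: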
{ "_id" : "database.sha_shard-uuid_MinKey", "lastmod" : Timestamp(2, 0), "lastmodEpoch" : ObjectId("5e08bad0b5e6b931087f0871"), "ns" : "database.sha_shard", "min" : { "uuid" : { "$minKey" : 1 } }, "max" : { "uuid" : "000043c071f23fc889275f77f950c649faac92e0" }, "shard" : "shardRpSet2", "history" : [ { "validAfter" : Timestamp(1577632842, 37), "shard" : "shardRpSet2" } ] }
The fields min and max are the chunk range: "min" : { "uuid" : { "$minKey" : 1 } }, "max" : { "uuid" : "000043c071f23fc889275f77f950c649faac92e0" },
That range defines what is inside the chunk. You can see how the range is evaluated by reading the BSON comparison order reference: https://docs.mongodb.com/v4.0/reference/bson-type-comparison-order/
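As a minimal sketch of the half-open chunk range, the check below tests whether a shard key value falls inside a single chunk. The sentinels `MIN_KEY` and `MAX_KEY` and the function `in_chunk_range` are hypothetical stand-ins: they only model the fact that, in the BSON comparison order, MinKey sorts before every other value and MaxKey sorts after every other value. The sketch assumes a well-formed range and string keys.

```python
# Stand-ins for the BSON MinKey / MaxKey values (assumption: only their
# "lower than everything" / "higher than everything" behaviour is modelled).
MIN_KEY = object()
MAX_KEY = object()


def in_chunk_range(value: str, lower, upper) -> bool:
    """True if lower <= value < upper (inclusive lower, exclusive upper)."""
    above_lower = lower is MIN_KEY or value >= lower
    below_upper = upper is MAX_KEY or value < upper
    return above_lower and below_upper


if __name__ == "__main__":
    boundary = "000043c071f23fc889275f77f950c649faac92e0"
    next_boundary = "1e8421c5d4f3eb45a82c2785bccc81fa7abfbfc7"

    # The first chunk from the question: MinKey up to (not including) boundary.
    print(in_chunk_range("00001a" + "0" * 34, MIN_KEY, boundary))
    # The boundary value itself is excluded from the first chunk ...
    print(in_chunk_range(boundary, MIN_KEY, boundary))
    # ... and included in the chunk that starts at that boundary.
    print(in_chunk_range(boundary, boundary, next_boundary))
```

Because the upper bound of one chunk is the lower bound of the next, every key belongs to exactly one chunk.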
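In your case, if the uuid field contains only strings, this is how a record is evaluated as falling within a chunk range: binary string comparison.
By default, MongoDB uses the simple binary comparison to compare strings. In other words, the hex strings are ordered byte by byte as text; they are not converted to numbers first.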
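To make the binary comparison concrete, here is a small sketch that compares strings by their UTF-8 bytes. The helpers `binary_less_than` and `sort_binary` are hypothetical names for illustration; they mimic a byte-wise comparison and are not MongoDB's actual code. The example also shows two consequences worth knowing: for lowercase hex strings of equal length, byte order matches numeric order, but that stops being true when lengths differ or when upper and lower case are mixed.

```python
from typing import List


def binary_less_than(a: str, b: str) -> bool:
    """Compare two strings byte by byte on their UTF-8 encoding."""
    return a.encode("utf-8") < b.encode("utf-8")


def sort_binary(values: List[str]) -> List[str]:
    """Sort strings in byte order, the order assumed for ranged chunks here."""
    return sorted(values, key=lambda s: s.encode("utf-8"))


if __name__ == "__main__":
    # Equal-length lowercase hex: byte order agrees with numeric order,
    # because '0'-'9' (0x30-0x39) sort before 'a'-'f' (0x61-0x66).
    keys = ["ff", "0a", "a0", "10"]
    print(sort_binary(keys) == sorted(keys, key=lambda s: int(s, 16)))

    # Different lengths: "10" sorts before "9" although 0x10 > 0x9.
    print(binary_less_than("10", "9"))

    # Mixed case: "AB" sorts before "ab" because 'A' (0x41) < 'a' (0x61).
    print(binary_less_than("AB", "ab"))
```

The uuid values in the question are all lowercase and 40 characters long, so under these assumptions their chunk ranges line up with the numeric order of the hex values.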