MongoDB

db.collection.validate() is missing output, and verify() returns EBUSY on the first run

  • January 12, 2017

Log from a successful run:

# grep conn103640 mongodb.log | head -10
2017-01-12T11:29:51.492+0000 I ACCESS   [conn103640] Successfully authenticated as principal tz6z on admin
2017-01-12T11:31:20.954+0000 I COMMAND  [conn103640] CMD: validate example.example
2017-01-12T11:31:20.970+0000 W STORAGE  [conn103640] verify() returned EBUSY. Not treating as invalid.
2017-01-12T11:46:26.307+0000 I INDEX    [conn103640] validating index example.example.$_id_
2017-01-12T11:46:26.608+0000 I STORAGE  [conn103640] WiredTiger progress WT_SESSION.verify 100
2017-01-12T11:46:26.802+0000 I STORAGE  [conn103640] WiredTiger progress WT_SESSION.verify 200
2017-01-12T11:46:26.964+0000 I STORAGE  [conn103640] WiredTiger progress WT_SESSION.verify 300
2017-01-12T11:46:27.208+0000 I STORAGE  [conn103640] WiredTiger progress WT_SESSION.verify 400
2017-01-12T11:46:27.389+0000 I STORAGE  [conn103640] WiredTiger progress WT_SESSION.verify 500
2017-01-12T11:46:27.636+0000 I STORAGE  [conn103640] WiredTiger progress WT_SESSION.verify 600

# grep conn103640 mongodb.log | tail -10
2017-01-12T13:03:27.090+0000 I STORAGE  [conn103640] WiredTiger progress WT_SESSION.verify 2700
2017-01-12T13:03:27.101+0000 I STORAGE  [conn103640] WiredTiger progress WT_SESSION.verify 2800
2017-01-12T13:03:27.112+0000 I STORAGE  [conn103640] WiredTiger progress WT_SESSION.verify 2900
2017-01-12T13:03:27.123+0000 I STORAGE  [conn103640] WiredTiger progress WT_SESSION.verify 3000
2017-01-12T13:03:27.133+0000 I STORAGE  [conn103640] WiredTiger progress WT_SESSION.verify 3100
2017-01-12T13:03:27.146+0000 I STORAGE  [conn103640] WiredTiger progress WT_SESSION.verify 3200
2017-01-12T13:03:27.158+0000 I STORAGE  [conn103640] WiredTiger progress WT_SESSION.verify 3300
2017-01-12T13:03:27.181+0000 I STORAGE  [conn103640] WiredTiger progress WT_SESSION.verify 3400
2017-01-12T13:03:27.198+0000 I STORAGE  [conn103640] WiredTiger progress WT_SESSION.verify 3500
2017-01-12T13:03:27.951+0000 I COMMAND  [conn103640] command example.$cmd command: validate { validate: "example", full: true } keyUpdates:0 writeConflicts:0 numYields:0 reslen:536 locks:{ Global: { acquireCount: { r: 3, w: 1 } }, Database: { acquireCount: { r: 1, w: 1 } }, Collection: { acquireCount: { r: 1, W: 1 } } } protocol:op_command 3884708ms

Output from the first (failed) run:

{
       "ns" : "example.example",
       "nrecords" : 3087895,
       "nIndexes" : 3,
       "keysPerIndex" : {
               "example.example.example.$_id_" : 3087895,
               "example.example.example.$files_id_1_n_1" : 3087895,
               "example.example.$files_id_1" : 3087895
       },
       "indexDetails" : {
               "example.example.$_id_" : {
                       "valid" : true
               },
               "example.example.$files_id_1_n_1" : {
                       "valid" : true
               },
               "example.example.$files_id_1" : {
                       "valid" : true
               }
       },
       "valid" : true,
       "errors" : [
               "verify() returned EBUSY. Not treating as invalid."
       ],
       "ok" : 1
}

Why does verify() return "EBUSY. Not treating as invalid"? Why does this error occur, and what does it mean?

> db.example.validate(true)
{
       "ns" : "example.example",
       "nrecords" : 3087895,
       "nIndexes" : 3,
       "keysPerIndex" : {
               "example.example.$_id_" : 3087895,
               "example.example.$files_id_1_n_1" : 3087895,
               "example.example.$files_id_1" : 3087895
       },
       "indexDetails" : {
               "example.example.$_id_" : {
                       "valid" : true
               },
               "example.example.$files_id_1_n_1" : {
                       "valid" : true
               },
               "example.example.$files_id_1" : {
                       "valid" : true
               }
       },
       "valid" : true,
       "errors" : [ ],
       "ok" : 1
}

We are running version 3.2.11 in a sharded cluster setup. We are missing a lot of the output described in the documentation.

For example:

  • validate.deletedSize
  • validate.deletedCount
  • validate.keysPerIndex
  • validate.firstExtentDetails

Why?

Here are the collection stats:

> db.getCollection('example').stats()
{
       "ns" : "example.example",
       "count" : 3093574,
       "size" : 41265087190,
       "avgObjSize" : 13338,
       "storageSize" : 44227588096,
       "capped" : false,
       "wiredTiger" : {
               "metadata" : {
                       "formatVersion" : 1
               },
               "creationString" : "allocation_size=4KB,app_metadata=(formatVersion=1),block_allocation=best,block_compressor=snappy,cache_resident=false,checksum=on,colgroups=,collator=,columns=,dictionary=0,encryption=(keyid=,name=),exclusive=false,extractor=,format=btree,huffman_key=,huffman_value=,ignore_in_memory_cache_size=false,immutable=false,internal_item_max=0,internal_key_max=0,internal_key_truncate=true,internal_page_max=4KB,key_format=q,key_gap=10,leaf_item_max=0,leaf_key_max=0,leaf_page_max=32KB,leaf_value_max=64MB,log=(enabled=true),lsm=(auto_throttle=true,bloom=true,bloom_bit_count=16,bloom_config=,bloom_hash_count=8,bloom_oldest=false,chunk_count_limit=0,chunk_max=5GB,chunk_size=10MB,merge_max=15,merge_min=0),memory_page_max=10m,os_cache_dirty_max=0,os_cache_max=0,prefix_compression=false,prefix_compression_min=4,source=,split_deepen_min_child=0,split_deepen_per_child=0,split_pct=90,type=file,value_format=u",
               "type" : "file",
               "uri" : "statistics:table:example/collection/3--7826890430442575271",
               "LSM" : {
                       "bloom filter false positives" : 0,
                       "bloom filter hits" : 0,
                       "bloom filter misses" : 0,
                       "bloom filter pages evicted from cache" : 0,
                       "bloom filter pages read into cache" : 0,
                       "bloom filters in the LSM tree" : 0,
                       "chunks in the LSM tree" : 0,
                       "highest merge generation in the LSM tree" : 0,
                       "queries that could have benefited from a Bloom filter that did not exist" : 0,
                       "sleep for LSM checkpoint throttle" : 0,
                       "sleep for LSM merge throttle" : 0,
                       "total size of bloom filters" : 0
               },
               "block-manager" : {
                       "allocations requiring file extension" : 569147,
                       "blocks allocated" : 1199781,
                       "blocks freed" : 609087,
                       "checkpoint size" : 44227502080,
                       "file allocation unit size" : 4096,
                       "file bytes available for reuse" : 69632,
                       "file magic number" : 120897,
                       "file major version number" : 1,
                       "file size in bytes" : 44227588096,
                       "minor version number" : 0
               },
               "btree" : {
                       "btree checkpoint generation" : 20780,
                       "column-store fixed-size leaf pages" : 0,
                       "column-store internal pages" : 0,
                       "column-store variable-size RLE encoded values" : 0,
                       "column-store variable-size deleted values" : 0,
                       "column-store variable-size leaf pages" : 0,
                       "fixed-record size" : 0,
                       "maximum internal page key size" : 368,
                       "maximum internal page size" : 4096,
                       "maximum leaf page key size" : 2867,
                       "maximum leaf page size" : 32768,
                       "maximum leaf page value size" : 67108864,
                       "maximum tree depth" : 5,
                       "number of key/value pairs" : 0,
                       "overflow pages" : 0,
                       "pages rewritten by compaction" : 0,
                       "row-store internal pages" : 0,
                       "row-store leaf pages" : 0
               },
               "cache" : {
                       "bytes currently in the cache" : 14869343433,
                       "bytes read into cache" : NumberLong("1459320699070073"),
                       "bytes written from cache" : 29038851892,
                       "checkpoint blocked page eviction" : 0,
                       "data source pages selected for eviction unable to be evicted" : 4135,
                       "hazard pointer blocked page eviction" : 1009,
                       "in-memory page passed criteria to be split" : 6665,
                       "in-memory page splits" : 3325,
                       "internal pages evicted" : 27664,
                       "internal pages split during eviction" : 92,
                       "leaf pages split during eviction" : 3640,
                       "modified pages evicted" : 587435,
                       "overflow pages read into cache" : 0,
                       "overflow values cached in memory" : 0,
                       "page split during eviction deepened the tree" : 0,
                       "page written requiring lookaside records" : 0,
                       "pages read into cache" : 6996594,
                       "pages read into cache requiring lookaside entries" : 0,
                       "pages requested from the cache" : 27857673,
                       "pages written from cache" : 1184917,
                       "pages written requiring in-memory restoration" : 2,
                       "unmodified pages evicted" : 4714196
               },
               "cache_walk" : {
                       "Average difference between current eviction generation when the page was last considered" : 0,
                       "Average on-disk page image size seen" : 0,
                       "Clean pages currently in cache" : 0,
                       "Current eviction generation" : 0,
                       "Dirty pages currently in cache" : 0,
                       "Entries in the root page" : 0,
                       "Internal pages currently in cache" : 0,
                       "Leaf pages currently in cache" : 0,
                       "Maximum difference between current eviction generation when the page was last considered" : 0,
                       "Maximum page size seen" : 0,
                       "Minimum on-disk page image size seen" : 0,
                       "On-disk page image sizes smaller than a single allocation unit" : 0,
                       "Pages created in memory and never written" : 0,
                       "Pages currently queued for eviction" : 0,
                       "Pages that could not be queued for eviction" : 0,
                       "Refs skipped during cache traversal" : 0,
                       "Size of the root page" : 0,
                       "Total number of pages currently in cache" : 0
               },
               "compression" : {
                       "compressed pages read" : 320144,
                       "compressed pages written" : 61726,
                       "page written failed to compress" : 1085090,
                       "page written was too small to compress" : 38101,
                       "raw compression call failed, additional data available" : 0,
                       "raw compression call failed, no additional data available" : 0,
                       "raw compression call succeeded" : 0
               },
               "cursor" : {
                       "bulk-loaded cursor-insert calls" : 0,
                       "create calls" : 1682,
                       "cursor-insert key and value bytes inserted" : 28740274252,
                       "cursor-remove key bytes removed" : 4341036,
                       "cursor-update value bytes updated" : 0,
                       "insert calls" : 2151614,
                       "next calls" : 9257008,
                       "prev calls" : 1,
                       "remove calls" : 1085226,
                       "reset calls" : 7027759,
                       "restarted searches" : 4058,
                       "search calls" : 3792524,
                       "search near calls" : 0,
                       "truncate calls" : 0,
                       "update calls" : 0
               },
               "reconciliation" : {
                       "dictionary matches" : 0,
                       "fast-path pages deleted" : 0,
                       "internal page key bytes discarded using suffix compression" : 2041847,
                       "internal page multi-block writes" : 12480,
                       "internal-page overflow keys" : 0,
                       "leaf page key bytes discarded using prefix compression" : 0,
                       "leaf page multi-block writes" : 9106,
                       "leaf-page overflow keys" : 0,
                       "maximum blocks required for a page" : 160,
                       "overflow values written" : 0,
                       "page checksum matches" : 1261346,
                       "page reconciliation calls" : 626355,
                       "page reconciliation calls for eviction" : 4518,
                       "pages deleted" : 586775
               },
               "session" : {
                       "object compaction" : 0,
                       "open cursor count" : 22
               },
               "transaction" : {
                       "update conflicts" : 0
               }
       },
       "nindexes" : 3,
       "totalIndexSize" : 143032320,
       "indexSizes" : {
               "_id_" : 47726592,
               "files_id_1_n_1" : 50769920,
               "files_id_1" : 44535808
       },
       "ok" : 1
}

Is it correct that db.collection.validate() does not repair anything? Is it read-only?

Why does verify() return "EBUSY. Not treating as invalid"? Why does this error occur, and what does it mean?

This is a generic warning indicating that a WiredTiger file could not be read because it was in exclusive use by another operation (for example, if a checkpoint happened to be updating the file at the same time it was being verified). The "Not treating as invalid" message is quite opaque, but it tries to convey "not a success, but not an error either" (the file was busy). This should be a transient error, and the validate command will likely succeed if you retry it. I created SERVER-27670 in the MongoDB issue tracker to make this message more user-friendly.
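Since the EBUSY condition is transient, retrying with a short delay is usually enough. A minimal sketch of that retry pattern (pure Python, with a stand-in function in place of the real db.example.validate(true) call; the names here are illustrative, not a MongoDB API):

```python
import time

def run_with_retry(operation, is_transient, attempts=3, delay=2.0):
    """Run operation(), retrying when the result looks transient (e.g. EBUSY)."""
    for attempt in range(1, attempts + 1):
        result = operation()
        if not is_transient(result):
            return result
        if attempt < attempts:
            time.sleep(delay)  # give the concurrent checkpoint time to finish
    return result

# Stand-in for db.example.validate(true): fails once with the EBUSY
# warning, then succeeds -- mimicking the behaviour seen in the question.
calls = {"n": 0}
def fake_validate():
    calls["n"] += 1
    if calls["n"] == 1:
        return {"valid": True,
                "errors": ["verify() returned EBUSY. Not treating as invalid."]}
    return {"valid": True, "errors": []}

result = run_with_retry(fake_validate, lambda r: bool(r["errors"]), delay=0.01)
print(result["errors"])  # → []
```

The second attempt succeeds with an empty errors array, just as the second run of validate did in the question.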

We are running version 3.2.11 in a sharded cluster setup. We are missing a lot of the output described in the documentation.

The validate output details vary depending on the storage engine, MongoDB server version, and configuration. The common and storage-engine-specific validate output fields are better described in the MongoDB 3.4 documentation, but that page has not yet been updated for 3.2/3.0 (I created DOCS-9766 for this).

I can see keysPerIndex in your example output (this is a common field). Extent and deleted/free-list details are specific to the MMAPv1 storage engine, so they are not included since you are using WiredTiger.
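The engine-specific split can be sketched as follows. The field groupings below are assumptions drawn from the 3.x documentation, for illustration only, and are not exhaustive:

```python
# Fields that appear in validate() output regardless of storage engine
# (illustrative subset, matching the output shown in the question).
COMMON_FIELDS = {"ns", "nrecords", "nIndexes", "keysPerIndex",
                 "indexDetails", "valid", "errors", "ok"}

# Extent and deleted/free-list details exist only under MMAPv1
# (illustrative subset).
MMAPV1_ONLY_FIELDS = {"deletedCount", "deletedSize", "firstExtentDetails"}

def expected_validate_fields(storage_engine):
    """Approximate which validate() fields to expect for a storage engine."""
    fields = set(COMMON_FIELDS)
    if storage_engine == "mmapv1":
        fields |= MMAPV1_ONLY_FIELDS
    return fields

# Under WiredTiger, the MMAPv1-specific fields are simply absent:
print("deletedCount" in expected_validate_fields("wiredTiger"))  # → False
print("keysPerIndex" in expected_validate_fields("wiredTiger"))  # → True
```

This is why the fields listed in the question (deletedSize, deletedCount, firstExtentDetails) never show up on a WiredTiger deployment, while keysPerIndex does.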

Is it correct that db.collection.validate() does not repair anything? Is it read-only?

Validation behaviour is storage-engine specific, but in general validate traverses the data structures and does not attempt to repair any errors. As of MongoDB 3.4, the one exception to validate being read-only is for the WiredTiger storage engine: some WiredTiger collection statistics (such as document counts) can be inaccurate after an unclean shutdown, and these will be corrected when the collection is validated. The collection statistics are metadata, not a structural change to the data files.

If data needs to be repaired/salvaged, the relevant command is repairDatabase. In general you should consider this a last resort, since many forms of data corruption are not repairable and the end result is a salvage of whatever data can be read successfully. For production deployments, the first approach to repairing a corrupt database should be resyncing from a healthy member of the same replica set.

Quoted from: https://dba.stackexchange.com/questions/160871