MongoDB Data File Corruption: repair as a Lifesaver, and Its Fatal Danger

Source: 这里教程网  Published: 2026-03-03 16:20:15

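Recently, a customer running a single-instance MongoDB database with no backups suffered data file corruption after a power failure. Because the business had to be back online quickly and the data was not sensitive, the customer asked for the fastest possible recovery, so we used MongoDB's built-in repair option (mongod --repair). The good news is that service was restored quickly. The bad news is that the data in the affected data file was lost, a loss the customer had accepted. This article summarizes the process and explains why the data was gone after the repair.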

1. A normal MongoDB query

> show dbs;
admin       0.000GB
config      0.000GB
dns_testdb  0.009GB
local       0.000GB
> use dns_testdb
switched to db dns_testdb
> db.test_collection.find();
{ "_id" : ObjectId("5fedd03d9d2569ee04ab62e1"), "name" : "elephant", "user_id" : 0, "boolean" : false, "added_at" : ISODate("2020-12-31T13:21:01.226Z"), "number" : 5129 }
{ "_id" : ObjectId("5fedd03d9d2569ee04ab62e2"), "name" : "dog", "user_id" : 1, "boolean" : false, "added_at" : ISODate("2020-12-31T13:21:01.237Z"), "number" : 9699 }
{ "_id" : ObjectId("5fedd03d9d2569ee04ab62e3"), "name" : "lion", "user_id" : 2, "boolean" : false, "added_at" : ISODate("2020-12-31T13:21:01.238Z"), "number" : 1783 }
Type "it" for more
>

2. Simulate data file corruption

Because /dev/null returns end-of-file immediately, the dd command below copies 0 bytes and leaves the collection's data file truncated to zero length.

[mongo@centos7 dns_testdb]$ du -sh *
28M collection-8--6736947369024546614.wt
9.5M index-9--6736947369024546614.wt
[mongo@centos7 dns_testdb]$ pwd
/opt/mongo/data/single/dns_testdb
[mongo@centos7 dns_testdb]$ dd if=/dev/null of=/opt/mongo/data/single/dns_testdb/collection-8--6736947369024546614.wt  bs=1024k count=5
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.000132203 s, 0.0 kB/s
[mongo@centos7 dns_testdb]$

3. Restart MongoDB

> use admin
switched to db admin
> db.shutdownServer();
[mongo@centos7 data]$ mongod --dbpath /opt/mongo/data/single --port 50001  --oplogSize 512  --fork --bind_ip 0.0.0.0 --logpath /opt/mongo/logs/single.log --logappend --journal --directoryperdb --profile=1
about to fork child process, waiting until server is ready for connections.
forked process: 102882
child process started successfully, parent exiting
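
The truncation in step 2 can be reproduced safely on a scratch file instead of a real dbpath. The following is a minimal sketch. The temporary file is a hypothetical stand-in for a collection-*.wt file, and the variable names are illustrative only.

```shell
#!/bin/sh
# Reproduce the step 2 truncation on a throwaway file (never on a live dbpath).
scratch=$(mktemp)                       # hypothetical stand-in for collection-*.wt
printf 'some bytes' > "$scratch"        # 10 bytes of placeholder content
before_size=$(wc -c < "$scratch" | tr -d ' ')

# /dev/null yields end-of-file at once, so dd copies nothing. dd also
# truncates its output file by default, so the file ends up empty.
dd if=/dev/null of="$scratch" bs=1024k count=5 2>/dev/null
after_size=$(wc -c < "$scratch" | tr -d ' ')

echo "before=$before_size after=$after_size"   # prints: before=10 after=0
rm -f "$scratch"
```

The bs and count values play no part in the result, because there is no input to copy. The file is simply emptied.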

4. The mongod process starts, but any operation on the collection whose data file is corrupted crashes mongod

[mongo@centos7 data]$ mongo --port 50001
MongoDB shell version v4.2.3
connecting to: mongodb://127.0.0.1:50001/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("09b6c6aa-059d-4a41-9e0d-e6553966399b") }
MongoDB server version: 4.2.3
Server has startup warnings: 
> show dbs;
admin       0.000GB
config      0.000GB
dns_testdb  0.037GB
local       0.000GB
> use dns_testdb;
switched to db dns_testdb
> db.test_collection.find();
2020-12-31T08:43:45.115-0500 I  NETWORK  [js] DBClientConnection failed to receive message from 127.0.0.1:50001 - HostUnreachable: Connection closed by peer
Error: error doing query: failed: network error while attempting to run command 'find' on host '127.0.0.1:50001' 
2020-12-31T08:43:45.118-0500 I  NETWORK  [js] trying reconnect to 127.0.0.1:50001 failed
2020-12-31T08:43:45.118-0500 I  NETWORK  [js] reconnect 127.0.0.1:50001 failed failed 
>

5. Check the MongoDB log: it reports possible data corruption and suggests repairing with --repair

2020-12-31T08:43:45.103-0500 E  STORAGE  [conn1] WiredTiger error (-31802) [1609422225:103947][102882:0x7f96713b5700], file:dns_testdb/collection-8--6736947369024546614.wt, WT_SESSION.open_cursor: __desc_read, 351: dns_testdb/collection-8--6736947369024546614.wt does not appear to be a WiredTiger file: WT_ERROR: non-specific WiredTiger error Raw: [1609422225:103947][102882:0x7f96713b5700], file:dns_testdb/collection-8--6736947369024546614.wt, WT_SESSION.open_cursor: __desc_read, 351: dns_testdb/collection-8--6736947369024546614.wt does not appear to be a WiredTiger file: WT_ERROR: non-specific WiredTiger error
2020-12-31T08:43:45.104-0500 E  STORAGE  [conn1] Failed to open a WiredTiger cursor. Reason: UnknownError: -31802: WT_ERROR: non-specific WiredTiger error, uri: table:dns_testdb/collection-8--6736947369024546614, config: 
2020-12-31T08:43:45.104-0500 E  STORAGE  [conn1] This may be due to data corruption. Please read the documentation for starting MongoDB with --repair here: http://dochub.mongodb.org/core/repair
2020-12-31T08:43:45.104-0500 F  -        [conn1] Fatal Assertion 50882 at src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp 101
2020-12-31T08:43:45.104-0500 F  -        [conn1] 
***aborting after fassert() failure
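
The corruption simulated here leaves a zero-length file, which is one case that can be spotted from the file system before any query reaches the collection. The following is a rough sketch of such a check. The function name find_empty_wt is made up for illustration. An empty result does not prove the files are healthy, because a non-empty file can be corrupt as well.

```shell
#!/bin/sh
# List WiredTiger files under a data directory that are zero bytes long.
# find's "-size 0" matches only empty files (sizes are rounded up to 512-byte blocks).
find_empty_wt() {
    find "$1" -type f -name '*.wt' -size 0
}

# Example call against the dbpath used in this article (commented out):
# find_empty_wt /opt/mongo/data/single
```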

6. Repair the database as the mongod log suggests

[mongo@centos7 data]$ mongod --dbpath /opt/mongo/data/single --port 50001  --oplogSize 512  --fork --bind_ip 0.0.0.0 --logpath /opt/mongo/logs/single.log --logappend --journal --directoryperdb --profile=1 --repair
about to fork child process, waiting until server is ready for connections.
forked process: 102942
child process started successfully, parent exiting
[mongo@centos7 data]$
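
Since --repair can discard data, a common precaution is to copy the whole dbpath aside while mongod is stopped, before running the repair. The following is a minimal sketch of that step. The function name backup_dbpath and the backup directory are hypothetical, and it assumes there is enough free disk space for a full copy.

```shell
#!/bin/sh
# Copy a stopped mongod's data directory aside before attempting --repair.
# $1 = existing dbpath, $2 = backup destination (must not exist yet).
backup_dbpath() {
    [ -d "$1" ] || { echo "no such dbpath: $1" >&2; return 1; }
    [ ! -e "$2" ] || { echo "refusing to overwrite: $2" >&2; return 1; }
    # -R copies recursively, -p preserves ownership, mode and timestamps;
    # diff -r then confirms the copy matches the source.
    cp -Rp "$1" "$2" && diff -r "$1" "$2" >/dev/null
}

# Example (commented out; the .bak path is an assumption, not from the article):
# backup_dbpath /opt/mongo/data/single /opt/mongo/data/single.bak \
#   && mongod --dbpath /opt/mongo/data/single --repair
```

With a copy in place, a failed or lossy repair does not remove the option of trying other recovery approaches on the original files later.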

7. During the repair, the mongod log shows the corrupted collection and its index being rebuilt

2020-12-31T08:44:45.646-0500 I  STORAGE  [initandlisten] repairDatabase dns_testdb
2020-12-31T08:44:45.646-0500 I  STORAGE  [initandlisten] Repairing collection dns_testdb.test_collection
2020-12-31T08:44:45.647-0500 E  STORAGE  [initandlisten] WiredTiger error (-31802) [1609422285:647413][102942:0x7fca99ec8c40], file:dns_testdb/collection-8--6736947369024546614.wt, WT_SESSION.verify: __desc_read, 351: dns_testdb/collection-8--6736947369024546614.wt does not appear to be a WiredTiger file: WT_ERROR: non-specific WiredTiger error Raw: [1609422285:647413][102942:0x7fca99ec8c40], file:dns_testdb/collection-8--6736947369024546614.wt, WT_SESSION.verify: __desc_read, 351: dns_testdb/collection-8--6736947369024546614.wt does not appear to be a WiredTiger file: WT_ERROR: non-specific WiredTiger error
2020-12-31T08:44:45.647-0500 I  STORAGE  [initandlisten] Verify failed on uri table:dns_testdb/collection-8--6736947369024546614. Running a salvage operation.
2020-12-31T08:44:45.647-0500 E  STORAGE  [initandlisten] WiredTiger error (-31802) [1609422285:647930][102942:0x7fca99ec8c40], file:dns_testdb/collection-8--6736947369024546614.wt, WT_SESSION.salvage: __desc_read, 351: dns_testdb/collection-8--6736947369024546614.wt does not appear to be a WiredTiger file: WT_ERROR: non-specific WiredTiger error Raw: [1609422285:647930][102942:0x7fca99ec8c40], file:dns_testdb/collection-8--6736947369024546614.wt, WT_SESSION.salvage: __desc_read, 351: dns_testdb/collection-8--6736947369024546614.wt does not appear to be a WiredTiger file: WT_ERROR: non-specific WiredTiger error
2020-12-31T08:44:45.648-0500 W  STORAGE  [initandlisten] Salvage failed for uri table:dns_testdb/collection-8--6736947369024546614: Salvage failed: -31802: WT_ERROR: non-specific WiredTiger error. The file will be moved out of the way and a new ident will be created.
2020-12-31T08:44:45.648-0500 W  STORAGE  [initandlisten] Moving data file /opt/mongo/data/single/dns_testdb/collection-8--6736947369024546614.wt to backup as /opt/mongo/data/single/dns_testdb/collection-8--6736947369024546614.wt.corrupt
2020-12-31T08:44:45.648-0500 W  STORAGE  [initandlisten] Rebuilding ident dns_testdb/collection-8--6736947369024546614
2020-12-31T08:44:45.708-0500 I  STORAGE  [initandlisten] Successfully re-created table:dns_testdb/collection-8--6736947369024546614.
2020-12-31T08:44:45.718-0500 I  INDEX    [initandlisten] index build: starting on dns_testdb.test_collection properties: { v: 2, key: { _id: 1 }, name: "_id_", ns: "dns_testdb.test_collection" } using method: Foreground
2020-12-31T08:44:45.718-0500 I  INDEX    [initandlisten] build may temporarily use up to 200 megabytes of RAM
2020-12-31T08:44:45.718-0500 I  STORAGE  [initandlisten] Index build initialized: 2ddee833-ea97-4964-98c0-7137e71a99c9: dns_testdb.test_collection: indexes: 1
2020-12-31T08:44:45.722-0500 I  STORAGE  [initandlisten] Index builds manager starting: 2ddee833-ea97-4964-98c0-7137e71a99c9: dns_testdb.test_collection
2020-12-31T08:44:45.724-0500 I  INDEX    [initandlisten] index build: inserted 0 keys from external sorter into index in 0 seconds
2020-12-31T08:44:45.727-0500 I  INDEX    [initandlisten] index build: done building index _id_ on ns dns_testdb.test_collection
2020-12-31T08:44:45.727-0500 I  STORAGE  [initandlisten] Index builds manager completed successfully: 2ddee833-ea97-4964-98c0-7137e71a99c9: dns_testdb.test_collection. Index specs requested: 1. Indexes in catalog before build: 1. Indexes in catalog after build: 1
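
The "Moving data file ... to backup as ..." warning above is the line that tells you which files repair gave up on and where it put them. The following is a small sketch that pulls those backup paths out of a log file. The function name list_moved_files is illustrative, and the pattern assumes the 4.2-style log wording shown in this article.

```shell
#!/bin/sh
# Print the ".corrupt" backup path from every "Moving data file" log line.
# $1 = path to the mongod log file.
list_moved_files() {
    sed -n 's/.*Moving data file .* to backup as \(.*\)$/\1/p' "$1"
}

# Example call against the log path used in this article (commented out):
# list_moved_files /opt/mongo/logs/single.log
```

Every path it prints corresponds to a collection that was recreated empty, so the list doubles as an inventory of what was lost.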

8. Restart the mongod service after the repair

[mongo@centos7 data]$ mongod --dbpath /opt/mongo/data/single --port 50001  --oplogSize 512  --fork --bind_ip 0.0.0.0 --logpath /opt/mongo/logs/single.log --logappend --journal --directoryperdb --profile=1 
about to fork child process, waiting until server is ready for connections.
forked process: 102975
child process started successfully, parent exiting
[mongo@centos7 data]$

9. After mongod starts, it serves queries normally, but the data in the collection whose data file was corrupted is gone

[mongo@centos7 data]$ mongo --port 50001
MongoDB shell version v4.2.3
connecting to: mongodb://127.0.0.1:50001/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("d88894c4-16bf-4013-a993-d29e2493fbdf") }
MongoDB server version: 4.2.3
Server has startup warnings: 
> show dbs;
admin       0.000GB
config      0.000GB
dns_testdb  0.000GB
local       0.000GB
> use dns_testdb;
switched to db dns_testdb
> db.test_collection.find();
>

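10. Summary

When a data file is corrupted and there is no backup, mongod --repair can cost you all the data in the collection stored in that file, although mongod does start normally afterward. The repair log shows why: verify and salvage both failed on the damaged file, so repair moved it aside as a .corrupt file, created a new ident in its place, and rebuilt the _id index with 0 keys. The recreated data file and index file start out empty, hence the data loss. When running MongoDB, it is therefore best to use a replica set for data redundancy, and to consider adding a delayed member to the replica set as protection against operator error.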
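
As a sketch of the delayed-member idea, the helper below prints mongo shell JavaScript that reconfigures one member of an existing replica set as a hidden, delayed secondary. The function name delayed_member_js, the member index 2 and the 3600-second delay are all hypothetical. The sketch assumes a replica set is already running (the instance in this article was a standalone) and that the script is run against the primary. The field name slaveDelay applies to the 4.2 series used here; it was renamed secondaryDelaySecs in 5.0.

```shell
#!/bin/sh
# Emit mongo shell JavaScript that makes one replica set member a hidden,
# delayed secondary. $1 = index into cfg.members, $2 = delay in seconds.
delayed_member_js() {
    cat <<EOF
cfg = rs.conf();
cfg.members[$1].priority = 0;      // never eligible to become primary
cfg.members[$1].hidden = true;     // invisible to client reads
cfg.members[$1].slaveDelay = $2;   // 4.2 field name; secondaryDelaySecs from 5.0
rs.reconfig(cfg);
EOF
}

# Example (commented out; the port, index and delay are assumptions):
# mongo --port 50001 --eval "$(delayed_member_js 2 3600)"
```

A delayed member only helps if the mistake is noticed within the delay window, so the delay should be chosen with the expected detection time in mind.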
