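Today I ran into a rather odd problem. A customer's test database lost power and restarted, and on startup the database reported ORA-01157: cannot identify/lock data file together with ORA-01110. Investigation showed that the storage volume had not been mounted after the operating system booted, and all of the tablespaces lived on that volume. Mounting the storage by hand resolved everything. I did not record the incident at the time, so here I reproduce it in a test environment.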
Creating the test data
[oracle@XLJ181 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Mon Dec 10 19:27:14 2018
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SYS@cams> create tablespace test datafile '/home/oracle/test.dbf' size 100M;
Tablespace created.
SYS@cams> create user test identified by 123456 default tablespace test;
User created.
SYS@cams> grant connect,resource to test;
Grant succeeded.
TEST@cams> create table test(id number primary key,name varchar2(20));
Table created.
TEST@cams> insert into test values(1,'bob');
1 row created.
TEST@cams> insert into test values(2,'joe');
1 row created.
TEST@cams> select count(*) from test;
  COUNT(*)
----------
         2
TEST@cams> conn / as sysdba
Connected.
SYS@cams> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SYS@cams> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Simulating an accidentally deleted file
[oracle@XLJ181 ~]$ mv /home/oracle/test.dbf /home/oracle/test.dbf.bak
The failure appears
Start the database; it reports that the data file cannot be found:
[oracle@XLJ181 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Mon Dec 10 19:38:26 2018
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to an idle instance.
SYS@cams> startup;
ORACLE instance started.
Total System Global Area 5344731136 bytes
Fixed Size                  2262656 bytes
Variable Size            1040189824 bytes
Database Buffers         4294967296 bytes
Redo Buffers                7311360 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 63 - see DBWR trace file
ORA-01110: data file 63: '/home/oracle/test.dbf'
Check the alert log, which points to the trace files:
Mon Dec 10 19:38:35 2018
ALTER DATABASE OPEN
Errors in file /u01/app/oracle/diag/rdbms/cams/cams/trace/cams_dbw0_21153.trc:
ORA-01157: cannot identify/lock data file 63 - see DBWR trace file
ORA-01110: data file 63: '/home/oracle/test.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /u01/app/oracle/diag/rdbms/cams/cams/trace/cams_ora_21175.trc:
ORA-01157: cannot identify/lock data file 63 - see DBWR trace file
ORA-01110: data file 63: '/home/oracle/test.dbf'
ORA-1157 signalled during: ALTER DATABASE OPEN...
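When several files are affected, reading the log by eye gets tedious. The following is a minimal sketch, not part of the original troubleshooting, that pulls the unique data file paths out of the ORA-01110 lines. The function name `missing_datafiles` is hypothetical, and the alert log path in the usage comment is an assumption based on the trace directory shown above.

```shell
#!/bin/sh
# Hypothetical helper: read alert-log text on stdin and print each distinct
# data file path named in an ORA-01110 line.
missing_datafiles() {
    sed -n "s/.*ORA-01110: data file [0-9][0-9]*: '\([^']*\)'.*/\1/p" | sort -u
}

# Usage (the alert log location is an assumption; adjust it to your ADR home):
# missing_datafiles < /u01/app/oracle/diag/rdbms/cams/cams/trace/alert_cams.log
```

Each path it prints can then be checked at the operating system level before deciding how to recover.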
The cams_ora_21175.trc file contains the following errors:
DDE: Problem Key 'ORA 1110' was flood controlled (0x1) (no incident)
ORA-01110: data file 63: '/home/oracle/test.dbf'
ORA-01157: cannot identify/lock data file 63 - see DBWR trace file
ORA-01110: data file 63: '/home/oracle/test.dbf'
The cams_dbw0_21153.trc file contains the following errors:
ORA-01157: cannot identify/lock data file 63 - see DBWR trace file
ORA-01110: data file 63: '/home/oracle/test.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
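ORA-27037 with "No such file or directory" means the operating system itself cannot see the file, so it is worth confirming that at the OS level before touching the database. In the customer's case the real cause was an unmounted storage volume. Below is a rough sketch of such a check; `check_datafile` is a hypothetical name, and the `df -P` call is only there to show which filesystem currently backs the parent directory.

```shell
#!/bin/sh
# Hypothetical helper: report whether a data file is visible to the OS.
# Returns 0 if the file exists, 1 otherwise.
check_datafile() {
    datafile=$1
    if [ -f "$datafile" ]; then
        echo "OK: $datafile exists"
        return 0
    fi
    echo "MISSING: $datafile"
    # Show the filesystem behind the parent directory. If the storage volume
    # is not mounted, this is typically the root filesystem instead.
    df -P "$(dirname "$datafile")" 2>/dev/null || echo "parent directory not found"
    return 1
}

# Usage:
# check_datafile /home/oracle/test.dbf
```

If the file is missing only because a volume is not mounted or the file was renamed, mounting the volume or moving the file back and then opening the database is far less destructive than dropping anything.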
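The problem is now obvious: data file 63, '/home/oracle/test.dbf', cannot be found. So how should we deal with it? Test environments in particular usually run without archive logging to save resources, let alone RMAN backups. How do we get the database running again while keeping data loss to a minimum?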
Common fix: offline drop + recreate
SQL> shutdown immediate;
SQL> startup mount;
SQL> alter database datafile '/home/oracle/test.dbf' offline drop;
SQL> alter database open;
-- Note: before running the next statement, check whether any other files belong to this tablespace
SQL> drop tablespace test including contents;
SQL> create tablespace test datafile '/home/oracle/test.dbf' size 100M;
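The note in the steps above asks you to check which files belong to the tablespace before dropping it. One way to do that, sketched here under assumptions, is to query the `DBA_DATA_FILES` dictionary view once the database is open. The helper name `tablespace_files_sql` is hypothetical; it only prints a SQL*Plus script, so nothing runs against the database until you pipe it to `sqlplus` yourself.

```shell
#!/bin/sh
# Hypothetical helper: print a SQL*Plus script that lists every data file of
# the given tablespace. The name is upper-cased because the dictionary stores
# unquoted tablespace names in upper case.
tablespace_files_sql() {
    ts=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    cat <<EOF
set pagesize 100 linesize 200
select file_id, file_name from dba_data_files where tablespace_name = '$ts';
exit
EOF
}

# Usage (run after "alter database open", since DBA_DATA_FILES needs an open database):
# tablespace_files_sql test | sqlplus -s "/ as sysdba"
```

If the query returns more than one file, dropping the tablespace discards the data in the healthy files as well, so export whatever is still readable first.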
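Because this is a test environment, the way to minimize data loss is to rebuild the data, or to import it from the most recent logical backup or from another test environment. If the lost tablespace belongs to a core system, it is usually better to recreate the tablespace, clean up the related data, and re-import a complete copy.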
