1 、 背景介绍
Redo 日志是 Oracle 数据库至关重要的组成部分,用于记录数据库中所有数据的变更操作。它保证了数据库的事务一致性和可恢复性,在数据库发生故障时,通过重做 Redo 日志中记录的操作,能够将数据库恢复到故障前的状态。然而,在实际运维过程中,可能会因误操作导致 Redo 日志损坏或被删除。本文将主要介绍在 非归档模式下 Redo 日志的不同状态( inactive 、 active 、 current ),详细介绍相应的恢复步骤和技术要点,以便在遇到此类问题时能够快速、有效地进行恢复。
2 、 Online Redo Inactive 故障场景
在非归档模式下, inactive 状态的 Redo 日志组虽然已不再被当前事务使用,但由于没有归档日志的保护,其内容一旦丢失,将无法通过归档日志进行恢复。不过,因为其不影响当前事务处理,所以相对 active 和 current 状态的损坏,影响较小。SQL> select * from v$log; GROUP# THREAD# SEQUENCE# BYTES BLOCKSIZE MEMBERS ARC STATUS FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME---------- ---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- --------- ------------ --------- 1 1 109 52428800 512 1 YES INACTIVE 5085400 24-JUN-20 5105406 24-JUN-20 2 1 110 52428800 512 1 YES INACTIVE 5105406 24-JUN-20 5125412 24-JUN-20 3 1 111 52428800 512 1 NO CURRENT 5125412 24-JUN-20 2.8147E+14 SQL> select * from v$logfile where group#=2; GROUP# STATUS TYPE MEMBER IS_---------- ------- --------------------------------------------------------------------- 2 ONLINE /u01/app/oradata/localdb/redo02.log NO [root@rac01 ~]# dd if=/dev/null of=/u01/app/oradata/localdb/redo02.log bs=512 count=300+0 records in0+0 records out0 bytes (0 B) copied, 0.000209111 s, 0.0 kB/s
可以正常关闭SQL> shutdown immediateDatabase closed.Database dismounted.ORACLE instance shut down.
开启报错SQL> startupORACLE instance started.Total System Global Area 521936896 bytesFixed Size 2254824 bytesVariable Size 264243224 bytesDatabase Buffers 251658240 bytesRedo Buffers 3780608 bytesDatabase mounted.ORA-03113: end-of-file on communication channelProcess ID: 13719Session ID: 1 Serial number: 5 alert.logARC0: STARTING ARCH PROCESSESErrors in file /u01/app/diag/rdbms/localdb/localdb/trace/localdb_lgwr_13817.trc:
ORA-00313: open failed for members of log group 2 of thread 1
ORA-00312: online log 2 thread 1: '/u01/app/oradata/localdb/redo02.log'
ORA-27047: unable to read the header block of file —redo group 2 的文件头部损坏Linux-x86_64 Error: 25: Inappropriate ioctl for deviceAdditional information: 1Errors in file /u01/app/diag/rdbms/localdb/localdb/trace/localdb_lgwr_13817.trc:ORA-00313: open failed for members of log group 2 of thread 1ORA-00312: online log 2 thread 1: '/u01/app/oradata/localdb/redo02.log'ORA-27047: unable to read the header block of fileLinux-x86_64 Error: 25: Inappropriate ioctl for deviceAdditional information: 1Errors in file /u01/app/diag/rdbms/localdb/localdb/trace/localdb_ora_13829.trc:ORA-00313: open failed for members of log group 1 of threadORA-00312: online log 2 thread 1: '/u01/app/oradata/localdb/redo02.log'Thu Jun 25 00:53:01 2020ARC1 started with pid=24, OS id=13855USER (ospid: 13829): terminating the instance due to error 313System state dump requested by (instance=1, osid=13829), summary=[abnormal instance termination].System State dumped to trace file /u01/app/diag/rdbms/localdb/localdb/trace/localdb_diag_13797_20200625005301.trcDumping diagnostic data in directory=[cdmp_20200625005301], requested by (instance=1, osid=13829), summary=[abnormal instance termination].Instance terminated by USER, pid = 13829 SQL> conn / as sysdbaConnected to an idle instance.SQL> startup mountORACLE instance started.Total System Global Area 521936896 bytesFixed Size 2254824 bytesVariable Size 264243224 bytesDatabase Buffers 251658240 bytesRedo Buffers 3780608 bytesDatabase mounted.SQL>SQL> set linesize 400SQL> select * from v$log; GROUP# THREAD# SEQUENCE# BYTES BLOCKSIZE MEMBERS ARC STATUS FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME---------- ---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- --------- ------------ --------- 1 1 109 52428800 512 1 YES INACTIVE 5085400 24-JUN-20 5105406 24-JUN-20 3 1 111 52428800 512 1 NO CURRENT 5125412 24-JUN-20 2.8147E+14 2 1 110 52428800 512 1 YES INACTIVE 5105406 24-JUN-20 5125412 24-JUN-20
inactive 的 redo 损坏 SQL> alter database clear logfile group 2; SQL> alter database drop logfile group 2; SQL> alter database add logfile group 2 ('/u01/app/oradata/localdb/redo02.log') size 50M reuse; SQL> alter database open;
3 、 Online Redo Active 故障场景
非归档模式下 active 状态的 Redo 日志组损坏,由于没有归档日志可供恢复,会导致数据库无法完整恢复到故障前的状态,可能会丢失部分未提交或已提交但尚未写入数据文件的事务,对数据一致性造成严重破坏。session 1:begin for i in 1 .. 5000000 loop insert into C##MARY.test values(i); commit; end loop; end; / session 2:SQL> set linesize 300SQL> select * from v$log; GROUP# THREAD# SEQUENCE# BYTES BLOCKSIZE MEMBERS ARC STATUS FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME---------- ---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- --------- ------------ --------- 1 1 4 52428800 512 1 NO CURRENT 5465964 25-JUN-20 2.8147E+14 2 1 0 52428800 512 1 YES UNUSED 0 0 3 1 3 52428800 512 1 YES INACTIVE 5383600 25-JUN-20 5465964 25-JUN-20 SQL> alter system switch logfile;System altered.SQL> //System altered.System altered. SQL> select * from v$log; GROUP# THREAD# SEQUENCE# BYTES BLOCKSIZE MEMBERS ARC STATUS FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME---------- ---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- --------- ------------ --------- 1 1 7 52428800 512 1 NO ACTIVE 5634319 25-JUN-20 5664004 25-JUN-20 2 1 8 52428800 512 1 NO CURRENT 5664004 25-JUN-20 2.8147E+14 3 1 6 52428800 512 1 NO ACTIVE 5584049 25-JUN-20 5634319 25-JUN-20 SQL> shutdown abortORACLE instance shut down. SQL> startup mountORACLE instance started.Total System Global Area 521936896 bytesFixed Size 2254824 bytesVariable Size 264243224 bytesDatabase Buffers 251658240 bytesRedo Buffers 3780608 bytesDatabase mounted. SQL> set linesize 300SQL> select * from v$log; GROUP# THREAD# SEQUENCE# BYTES BLOCKSIZE MEMBERS ARC STATUS FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME---------- ---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- --------- ------------ --------- 1 1 7 52428800 512 1 NO ACTIVE 5634319 25-JUN-20 5664004 25-JUN-20 3 1 6 52428800 512 1 NO ACTIVE 5584049 25-JUN-20 5634319 25-JUN-20 2 1 8 52428800 512 1 NO CURRENT 5664004 25-JUN-20 2.8147E+14 SQL> select member from v$logfile where group#=1;MEMBER-----------------------------------------------------/u01/app/oradata/localdb/redo01.log [root@rac01 ~]# dd if=/dev/null of=/u01/app/oradata/localdb/redo01.log bs=512 count=300+0 records in0+0 records out0 bytes (0 B) copied, 0.000134984 s, 0.0 kB/s SQL> alter database open ;ERROR at line 1:ORA-00313: open failed for members of log group 1 of thread 1ORA-00312: online log 1 thread 1: '/u01/app/oradata/localdb/redo01.log'ORA-27047: unable to read the header block of fileLinux-x86_64 Error: 25: Inappropriate ioctl for deviceAdditional information: 1 SQL> shutdown abortORACLE instance shut down.
pfile.ora 中添加两个参数
*._allow_resetlogs_corruption=true
*._allow_error_simulation=true SQL> startup mount pfile='/tmp/pfile.ora';ORACLE instance started.Total System Global Area 521936896 bytesFixed Size 2254824 bytesVariable Size 264243224 bytesDatabase Buffers 251658240 bytesRedo Buffers 3780608 bytesDatabase mounted. SQL> select file#,checkpoint_change# from v$datafile; FILE# CHECKPOINT_CHANGE#---------- ------------------ 1 5584049 2 5584049 3 5584049 4 5584049 5 5584049 6 5584049 7 5584049 8 5584049 9 55840499 rows selected. SQL> select file#,checkpoint_change# from v$datafile_header; FILE# CHECKPOINT_CHANGE#---------- ------------------ 1 5584049 2 5584049 3 5584049 4 5584049 5 5584049 6 5584049 7 5584049 8 5584049 9 55840499 rows selected. SQL> select SEQUENCE#,FIRST_CHANGE#,NEXT_CHANGE# from v$log;SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#---------- ------------- ------------ 7 5634319 5664004 6 5584049 5634319 8 5664004 2.8147E+14 SQL> recover database until cancel;ORA-00279: change 5584049 generated at 06/25/2020 02:21:21 needed for thread 1ORA-00289: suggestion : /u01/app/arch/1_6_1043978409.dbfORA-00280: change 5584049 for thread 1 is in sequence #6Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
6 号归档没有,但是 seq# 为 6 的 redo 没有损坏,输入
/u01/app/oradata/localdb/redo03.logORA-00279: change 5634319 generated at 06/25/2020 02:21:25 needed for thread 1ORA-00289: suggestion : /u01/app/arch/1_7_1043978409.dbfORA-00280: change 5634319 for thread 1 is in sequence #7ORA-00278: log file '/u01/app/oradata/localdb/redo03.log' no longer needed for this recoverySpecify log: {<RET>=suggested | filename | AUTO | CANCEL}
7 号归档没有,而且 seq# 为 7 的 redo 已损坏,输入 cancel
cancelORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error belowORA-01194: file 1 needs more recovery to be consistentORA-01110: data file 1: '/u01/app/oradata/localdb/system01.dbf'ORA-01112: media recovery not started SQL> alter database open resetlogs;Database altered.
本次实验破坏的不是最早的 seq# active ,如果破坏最早的 seq# ,会命中没有可以恢复的 redo ,将会报错 ora 600 [2662] 。
4 、 Online Redo Current 故障场景
非归档模式下 current 状态的 Redo 日志组损坏,是最为严重的情况,数据库将无法继续正常运行,并且由于缺乏归档日志的支持,几乎无法完整恢复到故障前的状态,会导致大量数据丢失和事务不一致。session 1:begin for i in 1 .. 5000000 loop insert into MARY.t values(i); commit; end loop; end; / session 2:SQL> set linesize 300SQL> select * from v$log; GROUP# THREAD# SEQUENCE# BYTES BLOCKSIZE MEMBERS ARC STATUS FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME---------- ---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- --------- ------------ --------- 1 1 22 52428800 512 1 NO INACTIVE 1074269901 11-JUL-20 1074269904 11-JUL-20 2 1 23 52428800 512 1 NO ACTIVE 1074269904 11-JUL-20 1074298682 11-JUL-20 3 1 24 52428800 512 1 NO CURRENT 1074298682 11-JUL-20 2.8147E+14 SQL> alter system switch logfile;//// SQL> select * from v$log; GROUP# THREAD# SEQUENCE# BYTES BLOCKSIZE MEMBERS ARC STATUS FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME---------- ---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- --------- ------------ --------- 1 1 28 52428800 512 1 NO CURRENT 1074497408 11-JUL-20 2.8147E+14 2 1 26 52428800 512 1 NO ACTIVE 1074442189 11-JUL-20 1074466376 11-JUL-20 3 1 27 52428800 512 1 NO ACTIVE 1074466376 11-JUL-20 1074497408 11-JUL-20 SQL> shutdown abortORACLE instance shut down. SQL> startup mountORACLE instance started.Total System Global Area 521936896 bytesFixed Size 2254824 bytesVariable Size 264243224 bytesDatabase Buffers 251658240 bytesRedo Buffers 3780608 bytesDatabase mounted. SQL> set linesize 300SQL> select * from v$log; GROUP# THREAD# SEQUENCE# BYTES BLOCKSIZE MEMBERS ARC STATUS FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME---------- ---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- --------- ------------ --------- 1 1 28 52428800 512 1 NO CURRENT 1074497408 11-JUL-20 2.8147E+14 3 1 27 52428800 512 1 NO ACTIVE 1074466376 11-JUL-20 1074497408 11-JUL-20 2 1 26 52428800 512 1 NO ACTIVE 1074442189 11-JUL-20 1074466376 11-JUL-20 SQL> select member from v$logfile where group#=1;MEMBER---------------------------------------------/u01/app/oradata/localdb/redo01.log SQL> !dd if=/dev/null of=/u01/app/oradata/localdb/redo01.log bs=512 count=300+0 records in0+0 records out0 bytes (0 B) copied, 0.000116154 s, 0.0 kB/s SQL> alter database open;ERROR at line 1:ORA-00313: open failed for members of log group 1 of thread 1ORA-00312: online log 1 thread 1: '/u01/app/oradata/localdb/redo01.log'ORA-27047: unable to read the header block of fileLinux-x86_64 Error: 25: Inappropriate ioctl for deviceAdditional information: 1 vi /tmp/pfile.ora
*._allow_resetlogs_corruption=true
*._allow_error_simulation=true SQL> startup mount pfile='/tmp/pfile.ora'ORACLE instance started.Total System Global Area 521936896 bytesFixed Size 2254824 bytesVariable Size 264243224 bytesDatabase Buffers 251658240 bytesRedo Buffers 3780608 bytesDatabase mounted. SQL> select group#,member from v$logfile; GROUP# MEMBER---------- ---------------------------------------- 1 /u01/app/oradata/localdb/redo01.log 2 /u01/app/oradata/localdb/redo02.log 3 /u01/app/oradata/localdb/redo03.log SQL>SQL> select * from v$log; GROUP# THREAD# SEQUENCE# BYTES BLOCKSIZE MEMBERS ARC STATUS FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME---------- ---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- --------- ------------ --------- 1 1 28 52428800 512 1 NO CURRENT 1074497408 11-JUL-20 2.8147E+14 3 1 27 52428800 512 1 NO ACTIVE 1074466376 11-JUL-20 1074497408 11-JUL-20 2 1 26 52428800 512 1 NO ACTIVE 1074442189 11-JUL-20 1074466376 11-JUL-20 SQL> recover database until cancel;ORA-00279: change 1074442189 generated at 07/11/2020 02:49:30 needed for thread 1ORA-00289: suggestion : /u01/app/arch/1_26_1045445359.dbfORA-00280: change 1074442189 for thread 1 is in sequence #26Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
要恢复 seq# 26 ,是 group# 2 ,没有损坏
/u01/app/oradata/localdb/redo02.logORA-00279: change 1074466376 generated at 07/11/2020 02:49:33 needed for thread 1ORA-00289: suggestion : /u01/app/arch/1_27_1045445359.dbfORA-00280: change 1074466376 for thread 1 is in sequence #27ORA-00278: log file '/u01/app/oradata/localdb/redo02.log' no longer needed for this recoverySpecify log: {<RET>=suggested | filename | AUTO | CANCEL}
要恢复 seq# 27 ,是 group# 3 ,没有损坏
/u01/app/oradata/localdb/redo03.logORA-00279: change 1074497408 generated at 07/11/2020 02:49:36 needed for thread 1ORA-00289: suggestion : /u01/app/arch/1_28_1045445359.dbfORA-00280: change 1074497408 for thread 1 is in sequence #28ORA-00278: log file '/u01/app/oradata/localdb/redo03.log' no longer needed for this recoverySpecify log: {<RET>=suggested | filename | AUTO | CANCEL}
要恢复 seq# 28 ,是 group# 3 ,已损坏
cancelORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error belowORA-01194: file 1 needs more recovery to be consistentORA-01110: data file 1: '/u01/app/oradata/localdb/system01.dbf'ORA-01112: media recovery not started SQL> select file#,checkpoint_change#,fuzzy from v$datafile_header; FILE# CHECKPOINT_CHANGE# FUZ---------- ------------------ --- 1 1074497408 YES 2 1074497408 YES 4 1074497408 YES 5 1074497408 YES 6 1074497408 YES 8 1074497408 YES6 rows selected. SQL> alter database open resetlogs;Database altered.
