Purpose of this test: simulate corruption and recovery of the OCR disk on Oracle 11g RAC shared storage.
No. | Test item | Estimated time
1 | Simulate Oracle 11g RAC shared-storage OCR disk corruption and recovery | 1 hour
Note: be sure to verify that the shared-disk permissions and ownership are correct on BOTH nodes. Check (and, if necessary, fix) them with the following commands:
ls -la /dev/rhdisk
chown grid:asmadmin /dev/rhdisk
chmod 660 /dev/rhdisk
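A quick way to audit those fields on each node is a small awk filter over the `ls -la` output (a sketch; `check_owner` is a hypothetical helper, and the sample lines mirror the listing in section 1.1):

```shell
# Hypothetical helper: print OK when an `ls -la` line shows grid:asmadmin
# ownership (fields 3 and 4), FIX otherwise.
check_owner() {
  echo "$1" | awk '{ if ($3 == "grid" && $4 == "asmadmin") print "OK"; else print "FIX" }'
}

check_owner "crw-rw----  1 grid asmadmin 21, 0 Aug 26 15:32 /dev/rhdisk0"   # → OK
check_owner "crw-------  1 root system   21, 0 Aug 26 15:32 /dev/rhdisk0"   # → FIX
```

On a real node the live listing can be piped through the same filter, e.g. `ls -la /dev/rhdisk* | while read line; do check_owner "$line"; done`.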
1. Preparation
1.1. Shared storage check
Confirm that every shared disk is in the Available state:
[cspdbtest1:/]lsdev -Cc disk
hdisk0 Available 11-T1-01 MPIO FC 2145
hdisk1 Available 11-T1-01 MPIO FC 2145
hdisk2 Available 11-T1-01 MPIO FC 2145
hdisk12 Available 11-T1-01 MPIO FC 2145
hdisk13 Available 11-T1-01 MPIO FC 2145
[cspdbtest1:/]lspv
hdisk12 00cb06f72bee733f rootvg active
hdisk13 00cb06f72beede57 datavg active
hdisk0 none None
hdisk1 none None
hdisk2 none None
[cspdbtest1:/]lscfg -vpl hdisk0
hdisk0 U9179.MHD.84434BW-V10-C11-T1-W500507680B246D58-L2000000000000 MPIO FC 2145
Manufacturer................IBM
Machine Type and Model......2145
ROS Level and ID............0000
Device Specific.(Z0)........0000063268181002
Device Specific.(Z1)........0100206
Serial Number...............60050764008181382800000000000115
PLATFORM SPECIFIC
Name: disk
Node: disk
Device Type: block
[cspdbtest1:/]
[cspdbtest1:/]lscfg -vpl hdisk1
hdisk1 U9179.MHD.84434BW-V10-C11-T1-W500507680B246D58-L3000000000000 MPIO FC 2145
Manufacturer................IBM
Machine Type and Model......2145
ROS Level and ID............0000
Device Specific.(Z0)........0000063268181002
Device Specific.(Z1)........0100206
Serial Number...............60050764008181382800000000000113
PLATFORM SPECIFIC
Name: disk
Node: disk
Device Type: block
[cspdbtest1:/]lscfg -vpl hdisk2
hdisk2 U9179.MHD.84434BW-V10-C11-T1-W500507680B246D58-L4000000000000 MPIO FC 2145
Manufacturer................IBM
Machine Type and Model......2145
ROS Level and ID............0000
Device Specific.(Z0)........0000063268181002
Device Specific.(Z1)........0100206
Serial Number...............60050764008181382800000000000114
PLATFORM SPECIFIC
Name: disk
Node: disk
Device Type: block
[cspdbtest1:/]lsdev -Cc disk
hdisk0 Available 11-T1-01 MPIO FC 2145
hdisk1 Available 11-T1-01 MPIO FC 2145
hdisk2 Available 11-T1-01 MPIO FC 2145
hdisk12 Available 11-T1-01 MPIO FC 2145
hdisk13 Available 11-T1-01 MPIO FC 2145
[cspdbtest1:/]
[cspdbtest1:/]lsvg -l datavg
datavg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lv_oracle jfs2 360 360 1 open/syncd /oracle
loglv00 jfs2log 1 1 1 open/syncd N/A
[cspdbtest1:/]lsvg -l rootvg
rootvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
hd5 boot 1 1 1 closed/syncd N/A
hd6 paging 128 128 1 open/syncd N/A
hd8 jfs2log 1 1 1 open/syncd N/A
hd4 jfs2 16 16 1 open/syncd /
hd2 jfs2 40 40 1 open/syncd /usr
hd9var jfs2 40 40 1 open/syncd /var
hd3 jfs2 80 80 1 open/syncd /tmp
hd1 jfs2 8 8 1 open/syncd /home
hd10opt jfs2 3 3 1 open/syncd /opt
hd11admin jfs2 1 1 1 open/syncd /admin
livedump jfs2 2 2 1 open/syncd /var/adm/ras/livedump
lv_software jfs2 400 400 1 open/syncd /software
log_lv jfs2 16 16 1 open/syncd /log
lg_dumplv sysdump 16 16 1 open/syncd N/A
[cspdbtest1:/dev]lsvg -p datavg
datavg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk13 active 619 258 123..00..00..11..124
[cspdbtest1:/dev]lsvg -p rootvg
rootvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk12 active 1199 447 239..78..00..00..130
[cspdbtest1:/home/grid]ls -l /dev/hdisk*
brw-rw---- 1 grid asmadmin 21, 0 Aug 26 15:32 /dev/hdisk0
brw-rw---- 1 grid asmadmin 21, 2 Aug 26 15:32 /dev/hdisk1
brw------- 1 root system 21, 13 Aug 26 14:05 /dev/hdisk12
brw------- 1 root system 21, 12 Aug 26 14:05 /dev/hdisk13
brw-rw---- 1 grid asmadmin 21, 1 Aug 26 15:32 /dev/hdisk2
Check the disk capacity:
[test1:/]getconf 'DISK_SIZE' /dev/hdisk3
20480
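On AIX, `getconf DISK_SIZE` reports the size in megabytes, so the 20480 above is a 20 GB LUN; a one-line sanity conversion:

```shell
# getconf DISK_SIZE returns MB on AIX; convert the value shown above to GB.
size_mb=20480
echo "$(( size_mb / 1024 )) GB"   # → 20 GB
```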
1.2. Oracle backup & groundwork
1.2.1. Back up the ASM spfile to a pfile
SQL> create pfile='/oracle/app/grid/asmpfile.ora' from spfile;
File created.
[test1:/oracle/app/grid]ls -la
total 8
drwxrwxr-x 7 grid oinstall 256 Sep 09 16:46 .
drwxr-xr-x 6 root oinstall 256 Feb 23 2016 ..
drwxr-xr-x 2 grid oinstall 256 Sep 09 15:03 Clusterware
-rw-r--r-- 1 grid oinstall 257 Sep 09 16:46 asmpfile.ora
drwxr-x--- 4 grid oinstall 256 Sep 09 15:19 cfgtoollogs
drwxr-xr-x 2 grid oinstall 256 Sep 09 15:23 checkpoints
drwxrwxr-x 4 grid oinstall 256 Sep 09 15:19 diag
drwxr-xr-x 3 grid oinstall 256 Sep 09 15:03 test1
1.2.2. Check the OCR backups
[test1:/]ocrconfig -showbackup
test1 2020/09/10 07:25:43 /oracle/app/11.2.0.4/grid/cdata/test/backup00.ocr
test1 2020/09/10 03:25:42 /oracle/app/11.2.0.4/grid/cdata/test/backup01.ocr
test1 2020/09/09 23:25:41 /oracle/app/11.2.0.4/grid/cdata/test/backup02.ocr
test1 2020/09/09 19:25:40 /oracle/app/11.2.0.4/grid/cdata/test/day.ocr
test1 2020/09/09 19:25:40 /oracle/app/11.2.0.4/grid/cdata/test/week.ocr
test1 2020/09/09 16:28:37 /oracle/app/11.2.0.4/grid/cdata/test/backup_20200909_162837.ocr
[test1:/]
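Before any destructive test it is wise to copy the newest automatic backup off this node (an on-demand backup can also be taken first with `ocrconfig -manualbackup`, run as root, which is what produced the `backup_20200909_162837.ocr` entry above). The newest entry can be picked out of the listing by sorting on the date and time fields; a sketch using the lines above as sample input:

```shell
# Sort the showbackup lines by date/time (fields 2-3), newest first, and
# keep the path (last field) of the most recent automatic backup.
latest=$(sort -k2,3 -r <<'EOF' | awk 'NR == 1 { print $NF }'
test1 2020/09/10 07:25:43 /oracle/app/11.2.0.4/grid/cdata/test/backup00.ocr
test1 2020/09/10 03:25:42 /oracle/app/11.2.0.4/grid/cdata/test/backup01.ocr
test1 2020/09/09 23:25:41 /oracle/app/11.2.0.4/grid/cdata/test/backup02.ocr
EOF
)
echo "$latest"   # → /oracle/app/11.2.0.4/grid/cdata/test/backup00.ocr
```

The resulting path can then be copied to the peer node or other safe storage with `cp`/`scp`.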
1.2.3. Check cluster status
[test1:/]crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA1.dg
ONLINE ONLINE test1
ONLINE ONLINE test2
ora.LISTENER.lsnr
ONLINE ONLINE test1
ONLINE ONLINE test2
ora.OCRDATA.dg
ONLINE ONLINE test1
ONLINE ONLINE test2
ora.asm
ONLINE ONLINE test1 Started
ONLINE ONLINE test2 Started
ora.gsd
OFFLINE OFFLINE test1
OFFLINE OFFLINE test2
ora.net1.network
ONLINE ONLINE test1
ONLINE ONLINE test2
ora.ons
ONLINE ONLINE test1
ONLINE ONLINE test2
ora.registry.acfs
ONLINE ONLINE test1
ONLINE ONLINE test2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE test1
ora.cvu
1 ONLINE ONLINE test1
ora.oc4j
1 ONLINE ONLINE test1
ora.scan1.vip
1 ONLINE ONLINE test1
ora.test1.vip
1 ONLINE ONLINE test1
ora.test2.vip
1 ONLINE ONLINE test2
ora.testdb.db
1 ONLINE ONLINE test1 Open
2 ONLINE ONLINE test2 Open
1.2.4. OCR disk check
[test1:/]ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 3112
Available space (kbytes) : 259008
ID : 1116292755
Device/File Name : +ocrdata
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
[test1:/]crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 4955f344cb664fb1bf2bdb459d0e6cee (/dev/rhdisk0) [OCRDATA]
Located 1 voting disk(s).
[test1:/]
2. OCR disk corruption recovery
2.1. Simulate disk corruption with dd
dd if=/dev/zero of=/dev/rhdisk0 bs=1M count=10
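Since this dd is destructive and irreversible on a live OCR disk, the flags can be rehearsed on a scratch file first to confirm exactly how much data they write (`/tmp/fakedisk` is just an illustrative path):

```shell
# Rehearsal: the same flags against a throwaway file write 10 MiB of zeros.
dd if=/dev/zero of=/tmp/fakedisk bs=1M count=10 2>/dev/null
wc -c < /tmp/fakedisk   # → 10485760
```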
2.2. Stop the cluster
[test1:/]crsctl stop crs
2.3. Starting the cluster again fails
[test1:/]crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[test1:/]crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.
2.4. Start CRS in exclusive mode
[test1:/] crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.mdnsd' on 'test1'
CRS-2676: Start of 'ora.mdnsd' on 'test1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'test1'
CRS-2676: Start of 'ora.gpnpd' on 'test1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'test1'
CRS-2672: Attempting to start 'ora.gipcd' on 'test1'
CRS-2676: Start of 'ora.cssdmonitor' on 'test1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'test1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'test1'
CRS-2672: Attempting to start 'ora.diskmon' on 'test1'
CRS-2676: Start of 'ora.diskmon' on 'test1' succeeded
CRS-2676: Start of 'ora.cssd' on 'test1' succeeded
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'test1'
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'test1'
CRS-2672: Attempting to start 'ora.ctssd' on 'test1'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'test1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'test1'
CRS-2676: Start of 'ora.ctssd' on 'test1' succeeded
CRS-2676: Start of 'ora.drivers.acfs' on 'test1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'test1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'test1'
CRS-2676: Start of 'ora.asm' on 'test1' succeeded
2.5. Recreate the disk group (here on a replacement disk, /dev/rhdisk3)
[CSP-DB-1:grid:/home/grid]sqlplus / as sysasm
SQL*Plus: Release 11.2.0.4.0 Production on Thu Sep 10 09:40:03 2020
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> create diskgroup OCRDATA external redundancy disk '/dev/rhdisk3' attribute 'compatible.asm'='11.2.0.0.0', 'compatible.rdbms'='11.2.0.0.0';
Diskgroup created.
2.6. Recreate the ASM spfile
SQL> create spfile='+OCRDATA' from pfile='/oracle/app/grid/asmpfile.ora';
File created.
2.7. Restore the OCR from backup
[test1:/]ocrconfig -restore /oracle/app/11.2.0.4/grid/cdata/test/backup00.ocr
[test1:/]ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 3112
Available space (kbytes) : 259008
ID : 1116292755
Device/File Name : +ocrdata
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
2.8. Restore the voting disk
[test1:/]crsctl query css votedisk
Located 0 voting disk(s).
[test1:/]crsctl replace votedisk +OCRDATA
Successful addition of voting disk 4c52e7685d004f0ebfcd3dbf7a610d69.
Successfully replaced voting disk group with +OCRDATA.
CRS-4266: Voting file(s) successfully replaced
[test1:/]crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 4c52e7685d004f0ebfcd3dbf7a610d69 (/dev/rhdisk3) [OCRDATA]
Located 1 voting disk(s).
2.9. Stop CRS
[test1:/]crsctl stop crs -f
2.10. Start CRS normally
crsctl start crs
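Once CRS is back up, health can be confirmed with `crsctl check crs` (and `crsctl stat res -t`, as in section 1.2.3). A healthy 11.2 check prints the four "is online" messages below, which a grep can count (the here-doc stands in for the live command output):

```shell
# A healthy `crsctl check crs` reports all four stack components online.
grep -c "is online" <<'EOF'
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
EOF
```

A count lower than 4 means part of the stack is still down.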
