1 A routine database inspection turned up the following alerts:

2021-08-16 02:13:59.207: [crflogd(6575)]CRS-9520:The storage of Grid Infrastructure Management Repository is 92% full. The storage location is '/data/grid/11.2.0/crf/db/host1'.
2021-08-16 02:19:04.207: [crflogd(6575)]CRS-9520:The storage of Grid Infrastructure Management Repository is 93% full. The storage location is '/data/grid/11.2.0/crf/db/host1'.

2 Check file system usage on the host. Disk usage is normal:

[root@host1 ~]# df -h
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root   50G   12G   36G  24% /
tmpfs                         7.9G  297M  7.6G   4% /dev/shm
/dev/sda2                     485M   62M  398M  14% /boot
/dev/sda1                     200M  260K  200M   1% /boot/efi
/dev/mapper/VolGroup-lv_home   77G   47G   27G  64% /home

3 Check the cluster status. It is normal:

[grid@host1 host1]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCHIVELOG.dg
               ONLINE  ONLINE       host1
               ONLINE  ONLINE       host2
ora.DATA.dg
               ONLINE  ONLINE       host1
               ONLINE  ONLINE       host2
ora.FLASHBAK.dg
               ONLINE  ONLINE       host1
               ONLINE  ONLINE       host2
ora.LISTENER.lsnr
               ONLINE  ONLINE       host1
               ONLINE  ONLINE       host2
ora.DATA1.dg
               ONLINE  ONLINE       host1
               ONLINE  ONLINE       host2
ora.DATA2.dg
               ONLINE  ONLINE       host1
               ONLINE  ONLINE       host2
ora.asm
               ONLINE  ONLINE       host1                    Started
               ONLINE  ONLINE       host2                    Started
ora.gsd
               OFFLINE OFFLINE      host1
               OFFLINE OFFLINE      host2
ora.net1.network
               ONLINE  ONLINE       host1
               ONLINE  ONLINE       host2
ora.ons
               ONLINE  ONLINE       host1
               ONLINE  ONLINE       host2
ora.registry.acfs
               ONLINE  ONLINE       host1
               ONLINE  ONLINE       host2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       host1
ora.cvu
      1        ONLINE  ONLINE       host1
ora.oc4j
      1        OFFLINE OFFLINE
ora.scan1.vip
      1        ONLINE  ONLINE       host1
ora.ywtest.db
      1        ONLINE  ONLINE       host2                    Open
      2        ONLINE  ONLINE       host1                    Open
ora.host1.vip
      1        ONLINE  ONLINE       host1
ora.host2.vip
      1        ONLINE  ONLINE       host2

4 Check the state of the CRF resource. It is ONLINE, and the repository directory holds only about 284 MB (290396 KB), which is not much.

[grid@host1 host1]$ crsctl stat res ora.crf -init -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.crf
      1        ONLINE  ONLINE       host1

Check the CRF-related processes:

[grid@host1 host1]$ ps -ef|grep osysmond
root      5329     1  4 Apr10 ?        5-10:43:26 /data/grid/11.2.0/bin/osysmond.bin
grid     15537 12181  0 14:43 pts/25   00:00:00 grep osysmond
[grid@host1 host1]$ ps -ef|grep ologgerd
root      6575     1  1 Apr10 ?        2-08:01:02 /data/grid/11.2.0/bin/ologgerd -m host2 -r -d /data/grid/11.2.0/crf/db/host1
grid     16077 12181  0 14:45 pts/25   00:00:00 grep ologgerd

Check the size of the CRF directory. It is small, about 284 MB in total:

[root@host1 host1]# du -k
290396  .

Use the following command to check how much data CHM retains. The value is in seconds: 61511 seconds is about 17 hours, which is not long.

[grid@host1 ~]$ oclumon
query> manage -get repsize
CHM Repository Size = 61511
 Done

5 Clean up the directory as follows.

Check the state of ora.crf:
/data/grid/11.2.0/bin/crsctl stat res ora.crf -init -t
Stop ora.crf:
/data/grid/11.2.0/bin/crsctl stop res ora.crf -init
Check the state of ora.crf again:
/data/grid/11.2.0/bin/crsctl stat res ora.crf -init -t
Delete the repository files (run this inside /data/grid/11.2.0/crf/db/host1):
rm crf*.bdb
Start ora.crf:
/data/grid/11.2.0/bin/crsctl start res ora.crf -init

The actual session on host1 follows. Note that it ran rm *.bdb, which also removed repdhosts.bdb.

[root@host1 /]# /data/grid/11.2.0/bin/crsctl stat res ora.crf -init -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.crf
      1        ONLINE  ONLINE       host1
[root@host1 /]# /data/grid/11.2.0/bin/crsctl stop res ora.crf -init
CRS-2673: Attempting to stop 'ora.crf' on 'host1'
CRS-2677: Stop of 'ora.crf' on 'host1' succeeded
[root@host1 host1]# pwd
/data/grid/11.2.0/crf/db/host1
[root@host1 host1]# rm *.bdb
rm: remove regular file `crfalert.bdb'? y
rm: remove regular file `crfclust.bdb'? y
rm: remove regular file `crfconn.bdb'? y
rm: remove regular file `crfcpu.bdb'? y
rm: remove regular file `crfhosts.bdb'? y
rm: remove regular file `crfloclts.bdb'? y
rm: remove regular file `crfts.bdb'? y
rm: remove regular file `repdhosts.bdb'? y
[root@host1 host1]# /data/grid/11.2.0/bin/crsctl start res ora.crf -init
CRS-2672: Attempting to start 'ora.crf' on 'host1'
CRS-2676: Start of 'ora.crf' on 'host1' succeeded
[root@host1 host1]# /data/grid/11.2.0/bin/crsctl stat res ora.crf -init -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.crf
      1        ONLINE  ONLINE       host1
[root@host1 host1]# du -sk
64732   .

6 On the next day's check the same alerts appeared again, with the directory back up to about 221 MB (226504 KB):

2021-08-17 02:23:33.856: [crflogd(1019)]CRS-9520:The storage of Grid Infrastructure Management Repository is 91% full. The storage location is '/data/grid/11.2.0/crf/db/host1'.
2021-08-17 02:28:38.855: [crflogd(1019)]CRS-9520:The storage of Grid Infrastructure Management Repository is 92% full. The storage location is '/data/grid/11.2.0/crf/db/host1'.
2021-08-17 02:33:43.855: [crflogd(1019)]CRS-9520:The storage of Grid Infrastructure Management Repository is 93% full. The storage location is '/data/grid/11.2.0/crf/db/host1'.

[root@host1 host1]# du -sk
226504  .

7 According to Oracle's official documentation, this service can be stopped without affecting normal cluster operation. Reports found online suggest that the cluster may misbehave if this repository reaches 100%, so I decided to stop the service to avoid that risk. The following is an excerpt from the official Oracle note Cluster Health Monitor (CHM) FAQ (Doc ID 1328466.1):

What is the Cluster Health Monitor?
The Cluster Health Monitor collects OS statistics (system metrics) such as memory and swap space usage, processes, IO usage, and network related data. The Cluster Health Monitor collects information in real time and usually once a second.
The Cluster Health Monitor collects OS statistics using OS API to gain performance and reduce the CPU usage overhead. The Cluster Health Monitor collects as much of system metrics and data as feasible that is restricted by the acceptable level of resource consumption by the tool.

What is the purpose of the Cluster Health Monitor?
The Cluster Health Monitor is developed to provide system metrics and data for troubleshooting many different types of problems such as node reboot and hang, instance eviction and hang, severe performance degradation, and any other problems that need the system metrics and data. By monitoring the data constantly, users can use the Cluster Health Monitor to detect potential problem areas such as CPU load, memory constraints, and spinning processes before the problem causes an unwanted outage.

Is stop/start ora.crf affecting clusterware function or cluster database function?
No, stop/start ora.crf resource will stop and start Cluster Health Monitor and its data collection, it will not affect clusterware or database functionality.

How much of overhead does the Cluster Health Monitor cause?
In today's server environment, the Cluster Health Monitor uses approximately less than 3% of the server's capacity for CPU. The overhead of using the Cluster Health Monitor is minimal. However, CHM on a server with a large number of disks or IO devices and more CPUs/memory would use more CPU than CHM on a server that does not have many disks and CPUs/memory.

How much of disk space is needed for the Cluster Health Monitor?
The Cluster Health Monitor takes up 1GB space by default on all nodes in the cluster. The approximate amount of data collected is 0.5 GB per node per day. The size of the repository can increase to collect and save data up to 3 days, and this will increase the disk usage appropriately.

How do I find out the size of data collected and saved by the Cluster Health Monitor in my system?
“oclumon manage -get repsize” will show the size in seconds.
To estimate the space required, use the following formula:
# of nodes * 720MB * 3 = Size required for 3 days retention
e.g. for a 4 node cluster: 4 * 720 * 3 = 8,640MB (8.4GB)

How can I increase the size of the Cluster Health Monitor repository?
“oclumon manage -repos resize <number in seconds less than 259200>”. Setting the value to 259200 will collect and save the data for 72 hours (3 days). It is recommended to set 72 hours of retention based on the above formula. This space needs to be available on all nodes in the cluster. Please resize the repositories or move them if necessary in order to achieve 72 hours of retention.

According to the following document, the size of the CHM repository can be adjusted, but because this system is short of space, no change was made.

How to Relocate Cluster Health Monitor (CHM) Repository and Increase Retention Time (Doc ID 2062234.1)

11.2
In 11.2, the repository of CHM is in Grid home, to change the retention time:

$ <GRID_HOME>/bin/oclumon manage -repos resize 259200
racnode1 --> retention check successful
racnode2 --> retention check successful
New retention is 259200 and will use 4525424640 bytes of disk space
CRS-9115-Cluster Health Monitor repository size change completed on all nodes.
Done

Note: the command line specifies for how many seconds to retain the data and it's recommended to be at least 259200 which is 3 days.

In case there's insufficient amount of space in Grid home, relocate CHM data with the following command:

$ <GRID_HOME>/bin/oclumon manage -repos reploc /home/grid/chm
racnode1 --> Ready to commit new location
racnode2 --> Ready to commit new location
New retention is 259200 and will use 4525424640 bytes of disk space
CRS-9113-Cluster Health Monitor repository location change completed on all nodes. Restarting Loggerd.
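The arithmetic in the FAQ excerpt can be checked with a small shell sketch. It only applies the quoted formula (nodes * 720 MB * 3) and converts a retention value in seconds to whole hours. The function names chm_space_mb_3days and chm_retention_hours are made up for this illustration and are not part of any Oracle tool.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not Oracle utilities.

# Space in MB for 3 days of retention, per the FAQ formula:
#   number of nodes * 720 MB * 3
chm_space_mb_3days() {
    nodes=$1
    echo $(( nodes * 720 * 3 ))
}

# Convert a retention value in seconds (the unit printed by
# "oclumon manage -get repsize") to whole hours, rounding down.
chm_retention_hours() {
    seconds=$1
    echo $(( seconds / 3600 ))
}

chm_space_mb_3days 2        # two-node cluster such as host1/host2
chm_retention_hours 61511   # value reported in step 4
chm_retention_hours 259200  # recommended retention from the FAQ
```

For the two-node cluster in this case the formula gives 4320 MB, the 61511 seconds from step 4 come to 17 hours, and 259200 seconds is 72 hours.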
8 Stop CRF

[root@host1 11.2.0]# /data/grid/11.2.0/bin/crsctl stop res ora.crf -init
CRS-2673: Attempting to stop 'ora.crf' on 'host1'
CRS-2677: Stop of 'ora.crf' on 'host1' succeeded
[root@host1 11.2.0]# /data/grid/11.2.0/bin/crsctl stat res ora.crf -init -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.crf
      1        OFFLINE OFFLINE
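On later inspections it may be handy to scan the alert log for any new CRS-9520 entries. The following is a minimal sketch, assuming the message format shown in steps 1 and 6. The function name max_crs9520_pct is hypothetical, and the location of the alert log is not given in this article, so the function simply reads log lines from standard input.

```shell
#!/bin/sh
# Sketch only: max_crs9520_pct is a made-up helper, not an Oracle tool.
# Reads log lines on stdin and prints the highest "NN% full" value found
# in CRS-9520 messages; prints nothing if there are no such messages.
max_crs9520_pct() {
    grep 'CRS-9520' \
        | sed -n 's/.* is \([0-9][0-9]*\)% full.*/\1/p' \
        | sort -n \
        | tail -n 1
}

# Demo using two of the alert lines quoted in step 6.
max_crs9520_pct <<'EOF'
2021-08-17 02:23:33.856: [crflogd(1019)]CRS-9520:The storage of Grid Infrastructure Management Repository is 91% full. The storage location is '/data/grid/11.2.0/crf/db/host1'.
2021-08-17 02:33:43.855: [crflogd(1019)]CRS-9520:The storage of Grid Infrastructure Management Repository is 93% full. The storage location is '/data/grid/11.2.0/crf/db/host1'.
EOF
```

In practice the here-document would be replaced by redirecting the real alert log into the function. Empty output after ora.crf has been stopped would suggest that no further alerts were logged.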
CRS-9520:The storage of Grid Infrastructure Management Repository is 93% full
Source: 这里教程网
Published: 2026-03-03 16:53:04