CRS-9520:The storage of Grid Infrastructure Management Repository is 93% full

Source: 这里教程网  Time: 2026-03-03 16:53:04  Author:

1 A routine database health check found the following alerts:

2021-08-16 02:13:59.207: [crflogd(6575)]CRS-9520:The storage of Grid Infrastructure Management Repository is 92% full. The storage location is '/data/grid/11.2.0/crf/db/host1'.
2021-08-16 02:19:04.207: [crflogd(6575)]CRS-9520:The storage of Grid Infrastructure Management Repository is 93% full. The storage location is '/data/grid/11.2.0/crf/db/host1'.

2 Check filesystem usage on the host. Disk usage is normal.

[root@host1 ~]# df -h
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root   50G   12G   36G  24% /
tmpfs                         7.9G  297M  7.6G   4% /dev/shm
/dev/sda2                     485M   62M  398M  14% /boot
/dev/sda1                     200M  260K  200M   1% /boot/efi
/dev/mapper/VolGroup-lv_home   77G   47G   27G  64% /home

3 Check the cluster status. It is normal.

[grid@host1 host1]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCHIVELOG.dg
               ONLINE  ONLINE       host1
               ONLINE  ONLINE       host2
ora.DATA.dg
               ONLINE  ONLINE       host1
               ONLINE  ONLINE       host2
ora.FLASHBAK.dg
               ONLINE  ONLINE       host1
               ONLINE  ONLINE       host2
ora.LISTENER.lsnr
               ONLINE  ONLINE       host1
               ONLINE  ONLINE       host2
ora.DATA1.dg
               ONLINE  ONLINE       host1
               ONLINE  ONLINE       host2
ora.DATA2.dg
               ONLINE  ONLINE       host1
               ONLINE  ONLINE       host2
ora.asm
               ONLINE  ONLINE       host1                    Started
               ONLINE  ONLINE       host2                    Started
ora.gsd
               OFFLINE OFFLINE      host1
               OFFLINE OFFLINE      host2
ora.net1.network
               ONLINE  ONLINE       host1
               ONLINE  ONLINE       host2
ora.ons
               ONLINE  ONLINE       host1
               ONLINE  ONLINE       host2
ora.registry.acfs
               ONLINE  ONLINE       host1
               ONLINE  ONLINE       host2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       host1
ora.cvu
      1        ONLINE  ONLINE       host1
ora.oc4j
      1        OFFLINE OFFLINE
ora.scan1.vip
      1        ONLINE  ONLINE       host1
ora.ywtest.db
      1        ONLINE  ONLINE       host2                    Open
      2        ONLINE  ONLINE       host1                    Open
ora.host1.vip
      1        ONLINE  ONLINE       host1
ora.host2.vip
      1        ONLINE  ONLINE       host2

4 Check the ora.crf resource. It is ONLINE, and the repository occupies only about 290 MB, which is not much.

[grid@host1 host1]$ crsctl stat res ora.crf -init -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.crf
      1        ONLINE  ONLINE       host1

Check the CRF-related processes:

[grid@host1 host1]$ ps -ef|grep osysmond
root      5329     1  4 Apr10 ?        5-10:43:26 /data/grid/11.2.0/bin/osysmond.bin
grid     15537 12181  0 14:43 pts/25   00:00:00 grep osysmond
[grid@host1 host1]$ ps -ef|grep ologgerd
root      6575     1  1 Apr10 ?        2-08:01:02 /data/grid/11.2.0/bin/ologgerd -m host2 -r -d  /data/grid/11.2.0/crf/db/host1
grid     16077 12181  0 14:45 pts/25   00:00:00 grep ologgerd

Check the CRF directory. Usage is low, about 290 MB in total:

[root@host1 host1]# du -k
290396  .

Use the following command to check how much data CHM retains. The value is in seconds: 61511 seconds is about 17 hours, which is not long.

[grid@host1 ~]$ oclumon
query>  manage -get repsize
CHM Repository Size = 61511
 Done

5 Clean up the directory as follows:

Check the status of ora.crf:
  /data/grid/11.2.0/bin/crsctl stat res ora.crf -init -t
Stop ora.crf:
  /data/grid/11.2.0/bin/crsctl stop res ora.crf -init
Check the status of ora.crf again:
  /data/grid/11.2.0/bin/crsctl stat res ora.crf -init -t
Delete the repository files (the session below actually ran rm *.bdb, which also removed repdhosts.bdb):
  rm crf*.bdb
Start ora.crf:
  /data/grid/11.2.0/bin/crsctl start res ora.crf -init

[root@host1 /]# /data/grid/11.2.0/bin/crsctl stat res ora.crf -init -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.crf
      1        ONLINE  ONLINE       host1
[root@host1 /]# /data/grid/11.2.0/bin/crsctl stop res ora.crf -init
CRS-2673: Attempting to stop 'ora.crf' on 'host1'
CRS-2677: Stop of 'ora.crf' on 'host1' succeeded
[root@host1 host1]# pwd
/data/grid/11.2.0/crf/db/host1
[root@host1 host1]# rm *.bdb
rm: remove regular file `crfalert.bdb'? y
rm: remove regular file `crfclust.bdb'? y
rm: remove regular file `crfconn.bdb'? y
rm: remove regular file `crfcpu.bdb'? y
rm: remove regular file `crfhosts.bdb'? y
rm: remove regular file `crfloclts.bdb'? y
rm: remove regular file `crfts.bdb'? y
rm: remove regular file `repdhosts.bdb'? y
[root@host1 host1]# /data/grid/11.2.0/bin/crsctl start res ora.crf -init
CRS-2672: Attempting to start 'ora.crf' on 'host1'
CRS-2676: Start of 'ora.crf' on 'host1' succeeded
[root@host1 host1]# /data/grid/11.2.0/bin/crsctl stat res ora.crf -init -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.crf
      1        ONLINE  ONLINE       host1
[root@host1 host1]# du -sk
64732   .

6 The next day the same alert had come back, with the directory at about 226 MB:

2021-08-17 02:23:33.856:  [crflogd(1019)]CRS-9520:The storage of Grid Infrastructure Management Repository is 91% full.  The storage location is '/data/grid/11.2.0/crf/db/host1'.
2021-08-17 02:28:38.855:  [crflogd(1019)]CRS-9520:The storage of Grid Infrastructure Management Repository is 92% full.  The storage location is '/data/grid/11.2.0/crf/db/host1'.
2021-08-17 02:33:43.855:  [crflogd(1019)]CRS-9520:The storage of Grid Infrastructure Management Repository is 93% full.  The storage location is '/data/grid/11.2.0/crf/db/host1'.
[root@host1 host1]# du -sk
226504  .

7 According to the official documentation, this service can be stopped without affecting normal cluster operation. Reports found online suggest that the cluster may misbehave if the repository reaches 100%, so I decided to stop the service to avoid that risk. The following is an excerpt from the Oracle document Cluster Health Monitor (CHM) FAQ (Doc ID 1328466.1):

What is the Cluster Health Monitor?
The Cluster Health Monitor collects OS statistics (system metrics) such as memory and swap space usage, processes, IO usage, and network related data. The Cluster Health Monitor collects information in real time and usually once a second. The Cluster Health Monitor collects OS statistics using OS API to gain performance and reduce the CPU usage overhead. The Cluster Health Monitor collects as much of system metrics and data as feasible that is restricted by the acceptable level of resource consumption by the tool.

What is the purpose of the Cluster Health Monitor?
The Cluster Health Monitor is developed to provide system metrics and data for troubleshooting many different types of problems such as node reboot and hang, instance eviction and hang, severe performance degradation, and any other problems that need the system metrics and data. By monitoring the data constantly, users can use the Cluster Health Monitor to detect potential problem areas such as CPU load, memory constraints, and spinning processes before the problem causes an unwanted outage.

Is stop/start ora.crf affecting clusterware function or cluster database function?
No, stop/start ora.crf resource will stop and start Cluster Health Monitor and its data collection, it will not affect clusterware or database functionality.

How much of overhead does the Cluster Health Monitor cause?
In today's server environment, the Cluster Health Monitor uses approximately less than 3% of the server's capacity for CPU. The overhead of using the Cluster Health Monitor is minimal. However, CHM on the server with large number of disks or IO devices and more CPUs/memory would use more CPU than CHM on a server that does not have many disks and CPUs/memory.

How much of disk space is needed for the Cluster Health Monitor?
The Cluster Health Monitor takes up 1GB space by default on all nodes in the cluster. The approximate amount of data collected is 0.5 GB per node per day. The size of the repository can increase to collect and save data up to 3 days, and this will increase the disk usage appropriately.

How do I find out the size of data collected and saved by the Cluster Health Monitor in my system?
“oclumon manage -get repsize” will show the size in seconds. To estimate the space required, use the following formula:
# of nodes * 720MB * 3 = Size required for 3 days retention
eg. for 4 node cluster: 4 * 720 * 3 = 8,640MB (8.4GB)

How can I increase the size of the Cluster Health Monitor repository?
“oclumon manage -repos resize <number in seconds less than 259200>”. Setting the value to 259200 will collect and save the data for 72 hours (3 days). It is recommended to set 72 hours of retention based on above formula. This space needs to be available on all nodes in the cluster. Please resize the repositories or move them if necessary in order to achieve 72 hours of retention.

According to the document below, the size of the CHM repository can be adjusted, but because this system is short on disk space, no change was made.

How to Relocate Cluster Health Monitor (CHM) Repository and Increase Retention Time (Doc ID 2062234.1)

11.2
In 11.2, the repository of CHM is in Grid home, to change the retention time:

$ <GRID_HOME>/bin/oclumon manage -repos resize 259200
racnode1 --> retention check successful
racnode2 --> retention check successful
New retention is 259200 and will use 4525424640 bytes of disk space
CRS-9115-Cluster Health Monitor repository size change completed on all nodes.
Done

Note: the command line specifies for how many seconds to retain the data and it's recommended to be at least 259200 which is 3 days.

In case there's insufficient amount of space in Grid home, relocate CHM data with the following command:

$ <GRID_HOME>/bin/oclumon manage -repos reploc /home/grid/chm
racnode1 --> Ready to commit new location
racnode2 --> Ready to commit new location
New retention is 259200 and will use 4525424640 bytes of disk space
CRS-9113-Cluster Health Monitor repository location change completed on all nodes. Restarting Loggerd.
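
The sizing formula quoted from the FAQ (number of nodes * 720 MB * 3 days) and the seconds-based retention value are easy to misread, so here is a minimal shell sketch of the arithmetic. The function names chm_required_mb and chm_retention_hours are hypothetical helpers made up for illustration; they are not part of oclumon or any Oracle tool.

```shell
#!/usr/bin/env bash
# Sketch only: arithmetic from the CHM FAQ, not an Oracle utility.

# Estimated repository space in MB: nodes * 720 MB * days (default 3 days).
chm_required_mb() {
    local nodes=$1
    local days=${2:-3}
    echo $(( nodes * 720 * days ))
}

# Convert a retention value in seconds (as reported by
# "oclumon manage -get repsize") to whole hours, rounded down.
chm_retention_hours() {
    local seconds=$1
    echo $(( seconds / 3600 ))
}

chm_required_mb 4            # the FAQ's 4-node example
chm_retention_hours 61511    # the value reported on host1
chm_retention_hours 259200   # the recommended retention
```

With the numbers from this article, the three calls print 8640, 17 and 72, matching the FAQ's 4-node example, the roughly 17 hours observed on host1, and the recommended 72 hours.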
8 Stop CRF

[root@host1 11.2.0]# /data/grid/11.2.0/bin/crsctl stop res ora.crf -init
CRS-2673: Attempting to stop 'ora.crf' on 'host1'
CRS-2677: Stop of 'ora.crf' on 'host1' succeeded
[root@host1 11.2.0]# /data/grid/11.2.0/bin/crsctl stat res ora.crf -init -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.crf
      1        OFFLINE OFFLINE
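
After stopping ora.crf, one way to confirm that the alert has stopped growing is to pull the fill percentage out of the CRS-9520 lines and watch whether new entries appear. The sketch below assumes only the message format shown in the alerts above. The function names crs9520_percents and crs9520_latest are hypothetical, and the log is read from standard input because the alert log location depends on the installation.

```shell
#!/usr/bin/env bash
# Sketch only: parse CRS-9520 lines in the format shown in this article.

# Print the fill percentage from every CRS-9520 line read on stdin.
crs9520_percents() {
    sed -n 's/.*CRS-9520:.* is \([0-9][0-9]*\)% full.*/\1/p'
}

# Print only the most recent percentage (last matching line), or nothing
# if the input contains no CRS-9520 line.
crs9520_latest() {
    crs9520_percents | tail -n 1
}

# Two sample lines copied from the alerts shown in step 6.
sample="2021-08-17 02:23:33.856:  [crflogd(1019)]CRS-9520:The storage of Grid Infrastructure Management Repository is 91% full.  The storage location is '/data/grid/11.2.0/crf/db/host1'.
2021-08-17 02:33:43.855:  [crflogd(1019)]CRS-9520:The storage of Grid Infrastructure Management Repository is 93% full.  The storage location is '/data/grid/11.2.0/crf/db/host1'."

printf '%s\n' "$sample" | crs9520_latest
```

Against the two sample lines this prints 93. To use it on a real system, redirect the clusterware alert log into crs9520_latest and compare the timestamp of the last CRS-9520 entry with the time ora.crf was stopped.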
