故障描述
某次,一套运行已经有点岁月的 O racle 11 g R2 单机环境,用户反馈某个物化视图不会定时自动刷新,进一步了解后,确认是 D BMS_JOB 无法按照指定的时间和间隔执行,根据以往的运维经验,估计大概率是触发 J OB 自动执行的条件不具备导致,该问题是比较容易处理。但实际情况并没有想的那么简单,详情如下。
根因分析
1
、对异常
2
17
J
OB
初步分析与处理
创建多个物化视图,但没有一个能定时自动触发
选择
2
17
号
J
OB
,修改其下一次执行时间和间隔
修改
2
17 JOB
下一次执行时间
next_date
begin
dbms_job.next_date(217,to_date('2022-12-07 17:35:00','yyyy-mm-dd hh24:mi:ss'));
commit;
end;
/
修改 2 17 JOB 执行间隔 interval
begin dbms_job.interval(217,interval => 'sysdate+10/1440'); commit; end; /
删除 2 17 JOB ,并重建
exec dbms_job.remove(217);
commit;
declare job2 number;
begin
dbms_job.submit(job2,'dbms_mview.refresh(''"PMS"."MV_HEX_PMS_USER_XBRL"'');',sysdate,'sysdate+10/1440');
commit;
end;
/
重建之后的 2 19 号 J OB 依旧没有执行
select job, what, last_date, next_date, interval, broken, failures, schema_user, log_user, priv_user from dba_jobs WHERE JOB = 219;
2
、模拟用户场景,在测试环境上进行验证
创建业务表
znfx_agg_zc_GT_bb
create table xuh.znfx_agg_zc_GT_bb( MONTHID varchar2(20), ZCZT varchar2(20), QYLX varchar2(20), CLRQ varchar2(20), SFXS varchar2(20), NUM number(20) , ZCZB number(20) , COUNT number(20) , NUM_CNT number(20) , ZCZB_CNT number(20) );
创建存储过程 znfx_tab_pro
create or replace procedure xuh.znfx_tab_pro(arg1 in varchar2) is v_count number := 1; begin while v_count <= 10000 loop insert into xuh.znfx_agg_zc_GT_bb values (arg1, 'AAA', 'BBB', 'CCC', 'DDD', v_count, v_count, v_count, v_count, v_count); v_count := v_count + 1; end loop; commit; end; /
执行存储过程,模拟数据插入
SQL> exec xuh.znfx_tab_pro('202212');
创建物化视图 JCD_AGG_ZC_GT_BB_MV
CREATE MATERIALIZED VIEW "XUH"."JCD_AGG_ZC_GT_BB_MV" ("MONTHID", "ZCZT", "QYLX", "CLRQ", "SFXS", "NUM", "ZCZB", "COUNT", "NUM_CNT", "ZCZB_CNT")
REFRESH force ON DEMAND
AS
select monthid,
zczt,
qylx,
clrq,
sfxs,
sum(num) num,
sum(zczb) zczb,
count(*) count,
Count(num) num_cnt,
Count(zczb) zczb_cnt
from xuh.znfx_agg_zc_GT_bb
group by monthid, zczt, qylx, clrq, sfxs;
手动执行物化视图刷新,确保刷新没有问题
SQL> exec dbms_mview.refresh('XUH.JCD_AGG_ZC_GT_BB_MV');
创建 J OB ,并跟踪观察 J OB 执行情况
declare job3 number;
begin
dbms_job.submit(job3,'dbms_mview.refresh(''"XUH"."JCD_AGG_ZC_GT_BB_MV"'');',sysdate,'sysdate+1/1440');
commit;
end;
/
可以看到该
J
OB
确实可以定时自动成功执行
3
、排查可能造成
J
OB
无法定时自动执行的原因
Check the most common reasons why jobs don't execute automatically and as scheduled:
1) Instance in RESTRICTED SESSIONS mode?
Check if the instance is in restricted sessions mode:
select instance_name,logins from v$instance;
If logins=RESTRICTED, then:
alter system disable restricted session;
^-- Checked!
2) JOB_QUEUE_PROCESSES=0
Make sure that job_queue_processes is > 0
show parameter job_queue_processes
^-- Checked!
3) _SYSTEM_TRIG_ENABLED=FALSE
Check if _system_enabled_trigger=false
col parameter format a25
col value format a15
select a.ksppinm parameter,b.ksppstvl value from x$ksppi a,x$ksppcv b
where a.indx=b.indx and ksppinm='_system_trig_enabled';
If _system_trig_enabled=false, then
alter system set "_system_trig_enabled"=TRUE scope=both;
^-- Checked!
4) Is the job BROKEN?
select job,broken from dba_jobs where job=<job_number>;
If broken, then check the alert log and trace files to diagnose the issue.
^-- Checked! The job is not broken.
5) Is the job COMMITted?
Make sure a commit is issued after submitting the job:
DECLARE X NUMBER;
BEGIN
SYS.DBMS_JOB.SUBMIT
(
job => X
,what => 'dbms_utility.analyze_schema
(''SCOTT'',''COMPUTE'',NULL,NULL,NULL);'
,next_date => to_date('08/06/2005 09:35:00','dd/mm/yyyy hh24:mi:ss')
,no_parse => FALSE
);
COMMIT;
END;
/
If the job executes fine if forced (i.e., exec dbms_jobs.run(<job_no>);), then likely a commit
is missing.
^-- Checked! The job is committed after submission.
6) UPTIME > 497 days
Check if the server (machine) has been up for more than 497 days:
For SUN, use 'uptime' OS command.
If uptime>497 and the jobs do not execute automatically, then you are hitting unpublished bug 3427424
(Jobs may stop running after 497 days uptime) which is fixed in 9206 and A102
^-- Checked! The server in this case has been up 126 days only
7) DBA_JOBS_RUNNING
Check dba_jobs_running to see if the job is still running:
select * from dba_jobs_running;
^-- Checked! The job is not running.
8) LAST_DATE and NEXT_DATE
Check if the last_date and next_date for the job are proper:
select Job,Next_date,Last_date from dba_jobs where job=<job_number>;
^-- NEXT_DATE is proper, however LAST_DATE is null since the job never executes automatically.
9) NEXT_DATE and INTERVAL
Check if the Next_date is changing properly as per the interval set in dba_jobs:
select Job,Interval,Next_date,Last_date from dba_jobs where job=<job_number>;
^-- This is not possible since the job never gets executed automatically.
10) Toggle value for JOB_QUEUE_PROCESSES
Stop and restart CJQ process(es)
alter system set job_queue_processes=0 ;
--<Wait for some time to ensure CJQ process stopped>
alter system set job_queue_processes=4 ;
Ref: Bug 2649244 (fixed by: 9015, 9203, 10201)
^-- Done but did not help
11) Check for unsuccessful shutdowns:
A shutdown immediate may get canceled because active sessions prevent the database close operation.
Please review the alert log for the last two shutdown / startups and the messages:
SHUTDOWN: Active sessions prevent database close operation
Please refer to
Note 434690.1 - Database Jobs Do Not Run After a Failed 'Shutdown Immediate'
12) DBMS_IJOB (non-documented package):
Either restart the database or try the following:
exec dbms_ijob.set_enabled(true);
Ref: Bug 3505718 (Closed, Not a Bug)
^-- Done but did not help
13) Check view DBA_SCHEDULER_GLOBAL_ATTRIBUTE for CURRENT_OPEN_WINDOW:
SQL> select * from DBA_SCHEDULER_GLOBAL_ATTRIBUTE where attribute_name='CURRENT_OPEN_WINDOW';
If a window is open close it (e.g.):
ATTRIBUTE_NAME VALUE
--------------------------------- ----------------------------
CURRENT_OPEN_WINDOW WEEKNIGHT_WINDOW
SQL> exec DBMS_SCHEDULER.close_window ('WEEKNIGHT_WINDOW');
^-- Done but did not help
These are the most common causes for this behavior.
通过分析
M
OS
文章
Jobs Not Executing Automatically (Doc ID 313102.1)
,
检查所有的可能原因,并根据官方建议调整,但是问题依旧。此时,想短时间通过
D
BMS_JOB
方式解决问题不太现实,因此尝试使用高版本官方建议的
DBMS_SCHEDULER
,确认可行性。
4
、使用
DBMS_SCHEDULER
创建任务,观察任务执行情况
创建
J
OB2
,并立即执行
BEGIN
DBMS_SCHEDULER.CREATE_JOB(
job_name => 'job2',
job_type => 'PLSQL_BLOCK',
job_action => 'dbms_mview.refresh(''"PMS"."MV_HEX_PMS_USER_XBRL"'');',
start_date => SYSDATE,
repeat_interval => 'FREQ = MINUTELY; INTERVAL = 10',
enabled => true);
END;
/
检查 DBA_SCHEDULER_JOBS 字典视图,发现任务没有自动执行
select OWNER, JOB_NAME, ENABLED, to_char(START_DATE, 'yyyy-mm-dd hh24:mi:ss') START_DATE, REPEAT_INTERVAL, RUN_COUNT, to_char(LAST_START_DATE, 'yyyy-mm-dd hh24:mi:ss') LAST_START_DATE, to_char(LAST_RUN_DURATION, 'yyyy-mm-dd hh24:mi:ss') LAST_RUN_DURATION, to_char(NEXT_RUN_DATE, 'yyyy-mm-dd hh24:mi:ss') NEXT_RUN_DATE from DBA_SCHEDULER_JOBS where job_name = 'JOB2'; [object Object]
select * from dba_scheduler_job_log t where owner='PMS' and job_name='JOB2';
5
、创建测试任务
test
_job
,并重启
D
BMS_JOB&
DBMS_SCHEDULER
属性
创建测试
test_job
任务
begin dbms_scheduler.create_job ( job_name => 'test_job', job_type => 'plsql_block', job_action => 'null;', start_date => SYSDATE, repeat_interval => 'FREQ = MINUTELY; INTERVAL = 1', enabled => true); end; /
检查 D BMS_JOB& DBMS_SCHEDULER 属性生效情况
select value from dba_scheduler_global_attribute where attribute_name='SCHEDULER_DISABLED'; 未选定行 SQL> oradebug setmypid SQL> oradebug dumpvar sga kkjsre sword kkjsre_ [060040500, 060040504) = 00000001
先禁用任务属性,再重新触发生效
exec dbms_ijob.set_enabled(FALSE);
exec dbms_ijob.set_enabled(TRUE);
alter system set job_queue_processes=0;
alter system set job_queue_processes=500;
--用户环境重启DBMS_SCHEDULER属性报错分析
exec dbms_scheduler.set_scheduler_attribute('SCHEDULER_DISABLED', 'TRUE');
--exec dbms_scheduler.set_scheduler_attribute('SCHEDULER_DISABLED', 'FALSE');
关闭
DBMS_SCHEDULER.SET_SCHEDULER_ATTRIBUTE
属性,命中
O
RA-1033
报错
通过分析
M
OS
文章
DBMS_SCHEDULER.SET_SCHEDULER_ATTRIBUTE Failing With ORA-01033. (Doc ID 756740.1)
,官方未给出明确的故障原因,等待一会之后重试成功,同时观测到
DBMS_SCHEDULER
创建的任务正常定时自动执行。
这跟其他运行正常的
Oracle
相比有些异常,甚至是相反,但不排除是这套环境自身的原因,甚至是
B
UG
。其实,官方原文中也有提到
Oracle
文档,
并
没有支持完全禁用调度器的方法
,只是确保在升级模式下,自动任务不会执行。
此时检查任务
test_job
执行情况,发现该任务已经在自动调度了
select OWNER, JOB_NAME, ENABLED, to_char(START_DATE, 'yyyy-mm-dd hh24:mi:ss') START_DATE, REPEAT_INTERVAL, RUN_COUNT, to_char(LAST_START_DATE, 'yyyy-mm-dd hh24:mi:ss') LAST_START_DATE, to_char(LAST_RUN_DURATION, 'yyyy-mm-dd hh24:mi:ss') LAST_RUN_DURATION, to_char(NEXT_RUN_DATE, 'yyyy-mm-dd hh24:mi:ss') NEXT_RUN_DATE from DBA_SCHEDULER_JOBS where job_name = 'TEST_JOB';
同理,此时业务任务
job
2
,同样已经在自动调度了,如下:
select OWNER, JOB_NAME, ENABLED, to_char(START_DATE, 'yyyy-mm-dd hh24:mi:ss') START_DATE, REPEAT_INTERVAL, RUN_COUNT, to_char(LAST_START_DATE, 'yyyy-mm-dd hh24:mi:ss') LAST_START_DATE, to_char(LAST_RUN_DURATION, 'yyyy-mm-dd hh24:mi:ss') LAST_RUN_DURATION, to_char(NEXT_RUN_DATE, 'yyyy-mm-dd hh24:mi:ss') NEXT_RUN_DATE from DBA_SCHEDULER_JOBS where job_name = 'JOB2';
select * from dba_scheduler_job_log t where owner='PMS' and job_name='JOB2';
解决办法
SQL> exec dbms_scheduler.set_scheduler_attribute('SCHEDULER_DISABLED', 'TRUE');
编辑推荐:
- Oracle无法自动调度DBMS_JOB&DBMS_SCHEDULER案例分析03-03
- Oracle、MySQL常见表结构变更语句对比03-03
- 【NUMBER】Oracle数据库优化之理解NUMBER存储机制03-03
- 19c ADG报错Error 1094 attaching to RFS for reconnect03-03
- 基于Linux的oracle 12cR2 RAC 标准化安装(三)03-03
- 基于Linux的oracle 12cR2 RAC 标准化安装(四)03-03
- Oracle Critical Patch Update for October 202203-03
- Oracle sql 语句中带有特殊的字符处理03-03
相关推荐
-
雷神推出 MIX PRO II 迷你主机:基于 Ultra 200H,玻璃上盖 + ARGB 灯效
2 月 9 日消息,雷神 (THUNDEROBOT) 现已宣布推出基于英
-
制造商 Musnap 推出彩色墨水屏电纸书 Ocean C:支持手写笔、第三方安卓应用
2 月 10 日消息,制造商 Musnap 现已在海外推出一款 Oce
