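After enabling parallel replication in MySQL, I noticed that the replication delay (Seconds_Behind_Master) does not stay at 0, even though there are no large transactions. Plenty of articles have already covered this; I am writing it down here for the record. In both parallel and non-parallel mode, the delay is computed by the code below: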
{
  long time_diff= ((long)(time(0) - mi->rli->last_master_timestamp)
                   - mi->clock_diff_with_master);
  protocol->store((longlong)(mi->rli->last_master_timestamp ?
                             max(0L, time_diff) : 0));
}
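To make the arithmetic concrete, here is a minimal stand-alone sketch of the same calculation. The function name `seconds_behind_master` and its parameters are hypothetical and only mirror the fields used in the snippet above; this is not the server code itself.

```cpp
#include <algorithm>
#include <cassert>
#include <ctime>

// Hypothetical model of the delay calculation quoted above.
// now:                    current time on the replica, i.e. time(0)
// last_master_timestamp:  0 means "nothing pending", so the delay is reported as 0
// clock_diff_with_master: clock offset between replica and master, in seconds
long seconds_behind_master(std::time_t now,
                           std::time_t last_master_timestamp,
                           long clock_diff_with_master)
{
  assert(now >= 0);
  if (last_master_timestamp == 0)
    return 0;
  long time_diff = static_cast<long>(now - last_master_timestamp)
                   - clock_diff_with_master;
  // A negative difference is clamped to 0, as in the original expression.
  return std::max(0L, time_diff);
}
```

With `now = 1000` and `last_master_timestamp = 990`, the sketch reports a delay of 10 seconds, so the reported delay grows for as long as `last_master_timestamp` is not refreshed.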
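The difference lies in how last_master_timestamp is set. In non-parallel mode, or in parallel mode when last_master_timestamp == 0 (which happens when the GAQ queue is empty), the value is set as follows while the relay log event is being executed: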
rli->last_master_timestamp= ev->common_header->when.tv_sec + (time_t) ev->exec_time;
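That is, the timestamp from the binlog event header plus the event's execution time.
In parallel replication, last_master_timestamp is instead set in the function mts_checkpoint_routine. This function performs the checkpoint: it processes the task at the head of the GAQ and obtains the LWM (low-water mark).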
/*
  Update the rli->last_master_timestamp for reporting correct Seconds_behind_master.
  If GAQ is empty, set it to zero.
  Else, update it with the timestamp of the first job of the Slave_job_queue
  which was assigned in the Log_event::get_slave_worker() function.
*/
ts= rli->gaq->empty()
  ? 0
  : reinterpret_cast<Slave_job_group*>(rli->gaq->head_queue())->ts;
rli->reset_notified_checkpoint(cnt, ts, need_data_lock, true);
/* end-of "Coordinator::"commit_positions" */
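When the GAQ is empty the value is set to 0; otherwise it is set to the timestamp of the first job in the Slave_job_queue.
mts_checkpoint_routine is called from next_event. Whether it actually runs depends on the checkpoint_group and mts_checkpoint_period parameters (exposed as the slave_checkpoint_group and slave_checkpoint_period system variables):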
bool force= (rli->checkpoint_seqno > (rli->checkpoint_group - 1));
if (rli->is_parallel_exec() && (opt_mts_checkpoint_period != 0 || force))
{
  ulonglong period= static_cast<ulonglong>(opt_mts_checkpoint_period * 1000000ULL);
  mysql_mutex_unlock(&rli->data_lock);
  /*
    At this point the coordinator has is delegating jobs to workers and
    the checkpoint routine must be periodically invoked.
  */
  (void) mts_checkpoint_routine(rli, period, force, true/*need_data_lock=true*/); // TODO: ALFRANIO ERROR
  DBUG_ASSERT(!force ||
              (force && (rli->checkpoint_seqno <= (rli->checkpoint_group - 1))) ||
              sql_slave_killed(thd, rli));
  mysql_mutex_lock(&rli->data_lock);
}
If the time elapsed since the last checkpoint is shorter than the period, the checkpoint is skipped and last_master_timestamp is not updated:
if (!force && diff < period)
{
  /*
    We do not need to execute the checkpoint now because
    the time elapsed is not enough.
  */
  DBUG_RETURN(FALSE);
}
So if the checkpoint is skipped or delayed, and events are therefore not handled in time, last_master_timestamp stays relatively old and a nonzero delay is reported.
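The stale-timestamp effect can be illustrated with a small stand-alone model. This is only a sketch under simplifying assumptions: `Group`, `CoordinatorModel`, and `checkpoint` are hypothetical names, the GAQ is reduced to a deque of timestamps with a "done" flag, and the period of 300000 microseconds in the usage note is just an example value.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical, simplified model; not the MySQL source.
struct Group {
  long ts;    // master timestamp assigned when the group was queued
  bool done;  // set once a worker has finished the group
};

struct CoordinatorModel {
  std::deque<Group> gaq;                 // oldest group at the front
  long last_master_timestamp = 0;
  std::uint64_t last_checkpoint_us = 0;  // time of the last checkpoint that ran
  std::uint64_t period_us = 0;           // minimum gap between checkpoints

  // Returns true if the checkpoint ran, false if it was skipped.
  bool checkpoint(std::uint64_t now_us, bool force) {
    assert(now_us >= last_checkpoint_us);
    std::uint64_t diff = now_us - last_checkpoint_us;
    if (!force && diff < period_us)
      return false;  // too early: last_master_timestamp is left untouched
    while (!gaq.empty() && gaq.front().done)
      gaq.pop_front();  // advance past the groups that are already finished
    last_master_timestamp = gaq.empty() ? 0 : gaq.front().ts;
    last_checkpoint_us = now_us;
    return true;
  }
};
```

For example, with `period_us = 300000` and two queued groups stamped 100 and 101, a checkpoint at 300000 microseconds sets `last_master_timestamp` to 100. If both groups then finish but the next call arrives at 400000 microseconds without `force`, it is skipped and the value stays at 100 even though no work is pending. Only a forced or sufficiently late checkpoint resets it to 0.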
