一、问题
这个问题大概是这样。当前主从状态如下有2个半同步从库,如果停掉1个半同步从库是是否会堵塞主库,堵塞时间大概是多久?
mysql> show global status like '%semi%';+--------------------------------------------+-----------+| Variable_name | Value |+--------------------------------------------+-----------+| Rpl_semi_sync_master_clients | 2 || Rpl_semi_sync_master_net_avg_wait_time | 0 || Rpl_semi_sync_master_net_wait_time | 0 || Rpl_semi_sync_master_net_waits | 8454 || Rpl_semi_sync_master_no_times | 9 || Rpl_semi_sync_master_no_tx | 1597 || Rpl_semi_sync_master_status | ON || Rpl_semi_sync_master_timefunc_failures | 0 || Rpl_semi_sync_master_tx_avg_wait_time | 77630 || Rpl_semi_sync_master_tx_wait_time | 169312495 || Rpl_semi_sync_master_tx_waits | 2181 || Rpl_semi_sync_master_wait_pos_backtraverse | 0 || Rpl_semi_sync_master_wait_sessions | 0 || Rpl_semi_sync_master_yes_tx | 2181 || Rpl_semi_sync_slave_status | OFF |+--------------------------------------------+-----------+15 rows in set (0.01 sec)mysql> show variables like '%semi%';+-------------------------------------------+---------------------+| Variable_name | Value |+-------------------------------------------+---------------------+| rpl_semi_sync_master_enabled | ON || rpl_semi_sync_master_timeout | 1000000000000000000 || rpl_semi_sync_master_trace_level | 32 || rpl_semi_sync_master_wait_for_slave_count | 2 || rpl_semi_sync_master_wait_no_slave | OFF || rpl_semi_sync_master_wait_point | AFTER_SYNC || rpl_semi_sync_slave_enabled | ON || rpl_semi_sync_slave_trace_level | 32 |+-------------------------------------------+---------------------+
二、分析
首先从设置中这里,
而从当前的状态来看,
这种情况下虽然rpl_semi_sync_master_timeout设置了很大值,但是如果2个半同步中一个关闭后,是否需要等待rpl_semi_sync_master_timeout设置的值呢?其实这就和 rpl_semi_sync_master_wait_no_slave 参数有关,因为设置为off,那么当半同步的dump线程退出的时候那么会判断当前半同步的数量是否满足rpl_semi_sync_master_wait_for_slave_count 参数设置,如果不满足则马上降级为异步复制(回调函数repl_semi_binlog_dump_end) 。
但是dump线程的退出并不是及时的,需要等到一次心跳超时再次发送心跳的时候发现无法发送才会停止dump线程,而线上繁忙的系统,通常DUMP线程不会处于等待心跳发送的情况下的,但是一但回话线程在等待ACK上堵塞,那么binlog就不会再更改,dump线程也就没有事可做,只能触发心跳发送,但是发送心跳前会等待一个心跳周期为slave_net_timeout/2,如下,
Binlog_sender::wait_with_heartbeatdo
{
set_timespec_nsec(&ts, m_heartbeat_period); //心跳周期
ret= mysql_bin_log.wait_for_update_bin_log(m_thd, &ts);//如果没有binlog生成,也会不会被提前唤醒,等待到超时
if (ret != ETIMEDOUT && ret != ETIME) //如果是正常收到 发送信号则返回 否则如果是超时则发送心跳event,进行心跳检测
break; //正常返回0 是超时返回ETIMEDOUT 继续循环...
if (send_heartbeat_event(log_pos)) //发送心跳
return 1;
} while (!m_thd->killed);
slave_net_timeout默认为60秒,因此心跳周期默认为是30秒的时间,也就是参数slave_net_timeout/2,也就是这种情况下,半同步降级可能来自2个方面,
因此此处 降级的时间大概应该是min(slave_net_timeout/2,rpl_semi_sync_master_timeout(微秒)),取一个小值。
当然如果rpl_semi_sync_master_wait_no_slave保持默认值ON,则必须等待rpl_semi_sync_master_timeout参数设置的大小才降级。
三、结论和测试截图
因此这里我们看到这个配置下(rpl_semi_sync_master_wait_no_slave=OFF,rpl_semi_sync_master_wait_for_slave_count =2),停掉一个从库实际上会影响到主库但是在slave_net_timeout/2超时后会降级,默认是30秒,也就是30秒降级,当然如果rpl_semi_sync_master_timeout是默认的10秒,那么优先使用rpl_semi_sync_master_timeout的时间作为降级标准。这种情况下,5.7 测试截图如下,
8.0测试如下,
四、超时栈
(gdb) bt #0 0x00007ffff7bcade2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0#1 0x00000000018a9cb7 in native_cond_timedwait (cond=0x2dc1100 <mysql_bin_log+1824>, mutex=0x2dc1048 <mysql_bin_log+1640>, abstime=0x7fffec165e10) at /opt/percona-server-locks-detail-5.7.22/include/thr_cond.h:129#2 0x00000000018aa01c in safe_cond_timedwait (cond=0x2dc1100 <mysql_bin_log+1824>, mp=0x2dc1020 <mysql_bin_log+1600>, abstime=0x7fffec165e10, file=0x2221bd8 "/opt/percona-server-locks-detail-5.7.22/sql/binlog.cc", line=8362) at /opt/percona-server-locks-detail-5.7.22/mysys/thr_cond.c:88#3 0x0000000001802a65 in my_cond_timedwait (cond=0x2dc1100 <mysql_bin_log+1824>, mp=0x2dc1020 <mysql_bin_log+1600>, abstime=0x7fffec165e10, file=0x2221bd8 "/opt/percona-server-locks-detail-5.7.22/sql/binlog.cc", line=8362) at /opt/percona-server-locks-detail-5.7.22/include/thr_cond.h:180#4 0x0000000001802f61 in inline_mysql_cond_timedwait (that=0x2dc1100 <mysql_bin_log+1824>, mutex=0x2dc1020 <mysql_bin_log+1600>, abstime=0x7fffec165e10, src_file=0x2221bd8 "/opt/percona-server-locks-detail-5.7.22/sql/binlog.cc", src_line=8362) at /opt/percona-server-locks-detail-5.7.22/include/mysql/psi/mysql_thread.h:1229#5 0x0000000001814919 in MYSQL_BIN_LOG::wait_for_update_bin_log (this=0x2dc09e0 <mysql_bin_log>, thd=0x7ffee40008c0, timeout=0x7fffec165e10) at /opt/percona-server-locks-detail-5.7.22/sql/binlog.cc:8362#6 0x0000000001839283 in Binlog_sender::wait_with_heartbeat (this=0x7fffec1665b0, log_pos=194) at /opt/percona-server-locks-detail-5.7.22/sql/rpl_binlog_sender.cc:652#7 0x00000000018391d2 in Binlog_sender::wait_new_events (this=0x7fffec1665b0, log_pos=194) at /opt/percona-server-locks-detail-5.7.22/sql/rpl_binlog_sender.cc:631#8 0x0000000001838e87 in Binlog_sender::get_binlog_end_pos (this=0x7fffec1665b0, log_cache=0x7fffec165fc0) at /opt/percona-server-locks-detail-5.7.22/sql/rpl_binlog_sender.cc:397#9 0x00000000018369c5 in Binlog_sender::send_binlog (this=0x7fffec1665b0, log_cache=0x7fffec165fc0, start_pos=123) at /opt/percona-server-locks-detail-5.7.22/sql/rpl_binlog_sender.cc:345#10 0x0000000001836383 in Binlog_sender::run (this=0x7fffec1665b0) at /opt/percona-server-locks-detail-5.7.22/sql/rpl_binlog_sender.cc:226#11 0x0000000001834074 in mysql_binlog_send (thd=0x7ffee40008c0, log_ident=0x7fffec1671e0 "", pos=4, slave_gtid_executed=0x7fffec166fb0, flags=0) at /opt/percona-server-locks-detail-5.7.22/sql/rpl_master.cc:414#12 0x0000000001833f40 in com_binlog_dump_gtid (thd=0x7ffee40008c0, packet=0x7ffee40063f1 "", packet_length=124) at /opt/percona-server-locks-detail-5.7.22/sql/rpl_master.cc:398#13 0x00000000015671d6 in dispatch_command (thd=0x7ffee40008c0, com_data=0x7fffec167d90, command=COM_BINLOG_DUMP_GTID) at /opt/percona-server-locks-detail-5.7.22/sql/sql_parse.cc:1705#14 0x00000000015655c5 in do_command (thd=0x7ffee40008c0) at /opt/percona-server-locks-detail-5.7.22/sql/sql_parse.cc:1021#15 0x00000000016a635c in handle_connection (arg=0x64b1d50) at /opt/percona-server-locks-detail-5.7.22/sql/conn_handler/connection_handler_per_thread.cc:312#16 0x00000000018ce0f6 in pfs_spawn_thread (arg=0x5d3bc20) at /opt/percona-server-locks-detail-5.7.22/storage/perfschema/pfs.cc:2190#17 0x00007ffff7bc6ea5 in start_thread () from /lib64/libpthread.so.0#18 0x00007ffff66008dd in clone () from /lib64/libc.so.6(gdb) c Continuing.2024-10-21T08:24:34.739778Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 8700ms. The settings might not be optimal. (flushed=0, during the time.)Breakpoint 3, repl_semi_binlog_dump_end (param=0x7fffec165ee0) at /opt/percona-server-locks-detail-5.7.22/plugin/semisync/semisync_master_plugin.cc:164164 bool semi_sync_slave= is_semi_sync_dump();(gdb) bt #0 repl_semi_binlog_dump_end (param=0x7fffec165ee0) at /opt/percona-server-locks-detail-5.7.22/plugin/semisync/semisync_master_plugin.cc:164#1 0x000000000148c6f6 in Binlog_transmit_delegate::transmit_stop (this=0x2d89fc0 <delegates_init()::transmit_mem>, thd=0x7ffee40008c0, flags=0) at /opt/percona-server-locks-detail-5.7.22/sql/rpl_handler.cc:729#2 0x0000000001836169 in Binlog_sender::cleanup (this=0x7fffec1665b0) at /opt/percona-server-locks-detail-5.7.22/sql/rpl_binlog_sender.cc:176#3 0x00000000018368b9 in Binlog_sender::run (this=0x7fffec1665b0) at /opt/percona-server-locks-detail-5.7.22/sql/rpl_binlog_sender.cc:312#4 0x0000000001834074 in mysql_binlog_send (thd=0x7ffee40008c0, log_ident=0x7fffec1671e0 "", pos=4, slave_gtid_executed=0x7fffec166fb0, flags=0) at /opt/percona-server-locks-detail-5.7.22/sql/rpl_master.cc:414#5 0x0000000001833f40 in com_binlog_dump_gtid (thd=0x7ffee40008c0, packet=0x7ffee40063f1 "", packet_length=124) at /opt/percona-server-locks-detail-5.7.22/sql/rpl_master.cc:398#6 0x00000000015671d6 in dispatch_command (thd=0x7ffee40008c0, com_data=0x7fffec167d90, command=COM_BINLOG_DUMP_GTID) at /opt/percona-server-locks-detail-5.7.22/sql/sql_parse.cc:1705#7 0x00000000015655c5 in do_command (thd=0x7ffee40008c0) at /opt/percona-server-locks-detail-5.7.22/sql/sql_parse.cc:1021#8 0x00000000016a635c in handle_connection (arg=0x64b1d50) at /opt/percona-server-locks-detail-5.7.22/sql/conn_handler/connection_handler_per_thread.cc:312#9 0x00000000018ce0f6 in pfs_spawn_thread (arg=0x5d3bc20) at /opt/percona-server-locks-detail-5.7.22/storage/perfschema/pfs.cc:2190#10 0x00007ffff7bc6ea5 in start_thread () from /lib64/libpthread.so.0#11 0x00007ffff66008dd in clone () from /lib64/libc.so.6
(gdb) thread 73[Switching to thread 73 (Thread 0x7fffd8161700 (LWP 1887))]#0 0x00007ffff7bcade2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0(gdb) bt
#0 0x00007ffff7bcade2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0#1 0x0000000004802623 in native_cond_timedwait (cond=0x7e0efa8 <mysql_bin_log+1096>, mutex=0xa2d21d8, abstime=0x7fffd815e280) at /newdata/mysql-8.0.23/include/thr_cond.h:100#2 0x000000000480298e in safe_cond_timedwait (cond=0x7e0efa8 <mysql_bin_log+1096>, mp=0xa2d21b0, abstime=0x7fffd815e280, file=0x61b1de8 "/newdata/mysql-8.0.23/sql/binlog.cc", line=7557)
at /newdata/mysql-8.0.23/mysys/thr_cond.cc:113#3 0x00000000043b8259 in my_cond_timedwait (cond=0x7e0efa8 <mysql_bin_log+1096>, mp=0x7e0ef48 <mysql_bin_log+1000>, abstime=0x7fffd815e280, file=0x61b1de8 "/newdata/mysql-8.0.23/sql/binlog.cc",
line=7557) at /newdata/mysql-8.0.23/include/thr_cond.h:147#4 0x00000000043b8496 in inline_mysql_cond_timedwait (that=0x7e0efa8 <mysql_bin_log+1096>, mutex=0x7e0ef48 <mysql_bin_log+1000>, abstime=0x7fffd815e280,
src_file=0x61b1de8 "/newdata/mysql-8.0.23/sql/binlog.cc", src_line=7557) at /newdata/mysql-8.0.23/include/mysql/psi/mysql_cond.h:241#5 0x00000000043cb6a6 in MYSQL_BIN_LOG::wait_for_update (this=0x7e0eb60 <mysql_bin_log>, timeout=0x7fffd815e280) at /newdata/mysql-8.0.23/sql/binlog.cc:7557#6 0x000000000448acaf in Binlog_sender::wait_with_heartbeat (this=0x7fffd815eb60, log_pos=316133) at /newdata/mysql-8.0.23/sql/rpl_binlog_sender.cc:796#7 0x0000000004489023 in Binlog_sender::wait_new_events (this=0x7fffd815eb60, log_pos=316133) at /newdata/mysql-8.0.23/sql/rpl_binlog_sender.cc:778#8 0x0000000004488502 in Binlog_sender::get_binlog_end_pos (this=0x7fffd815eb60, reader=0x7fffd815e970, end_pos=0x7fffd815e440) at /newdata/mysql-8.0.23/sql/rpl_binlog_sender.cc:557#9 0x0000000004488201 in Binlog_sender::send_binlog (this=0x7fffd815eb60, reader=0x7fffd815e970, start_pos=125) at /newdata/mysql-8.0.23/sql/rpl_binlog_sender.cc:511#10 0x0000000004487b95 in Binlog_sender::run (this=0x7fffd815eb60) at /newdata/mysql-8.0.23/sql/rpl_binlog_sender.cc:412#11 0x0000000004483604 in mysql_binlog_send (thd=0x7fff20015540, log_ident=0x7fffd815f700 "", pos=4, slave_gtid_executed=0x7fffd815f540, flags=2) at /newdata/mysql-8.0.23/sql/rpl_master.cc:1017#12 0x00000000044834c9 in com_binlog_dump_gtid (thd=0x7fff20015540, packet=0x7fff20006f71 "", packet_length=70) at /newdata/mysql-8.0.23/sql/rpl_master.cc:1002#13 0x000000000323dcc9 in dispatch_command (thd=0x7fff20015540, com_data=0x7fffd8160b40, command=COM_BINLOG_DUMP_GTID) at /newdata/mysql-8.0.23/sql/sql_parse.cc:2048#14 0x000000000323b513 in do_command (thd=0x7fff20015540) at /newdata/mysql-8.0.23/sql/sql_parse.cc:1320#15 0x000000000340dfb1 in handle_connection (arg=0xa2d4880) at /newdata/mysql-8.0.23/sql/conn_handler/connection_handler_per_thread.cc:301#16 0x000000000500dc16 in pfs_spawn_thread (arg=0xa524270) at /newdata/mysql-8.0.23/storage/perfschema/pfs.cc:2900#17 0x00007ffff7bc6ea5 in start_thread () from /lib64/libpthread.so.0#18 0x00007ffff61418dd in clone () from /lib64/libc.so.6(gdb) f 5#5 0x00000000043cb6a6 in MYSQL_BIN_LOG::wait_for_update (this=0x7e0eb60 <mysql_bin_log>, timeout=0x7fffd815e280) at /newdata/mysql-8.0.23/sql/binlog.cc:75577557 ret = mysql_cond_timedwait(&update_cond, &LOCK_binlog_end_pos,(gdb) p timeout
$3 = (const timespec *) 0x7fffd815e280(gdb) p *timeout
$4 = {tv_sec = 1729517575, tv_nsec = 413257724}(gdb) f 6#6 0x000000000448acaf in Binlog_sender::wait_with_heartbeat (this=0x7fffd815eb60, log_pos=316133) at /newdata/mysql-8.0.23/sql/rpl_binlog_sender.cc:796796 ret = mysql_bin_log.wait_for_update(&ts);(gdb) p ts
$5 = {tv_sec = 1729517575, tv_nsec = 413257724}(gdb) p m_heartbeat_period.count()$6 = 30000001024
编辑推荐:
- 主从半同步降级异步分析03-01
- 一个古老的死锁BUG终于修复了03-01
- mysql面试03-01
- 如何让SQL速度飞起来 入门YashanDB优化器03-01
- 数智化转型 | 星环科技Defensor 助力某银行数据分类分级03-01
- 刚刚!MySQL创新版9.2.0发布,学爆它!03-01
- 【SQLAOT】一个用来辅助SQL优化工作的工具03-01
- 开发一个集婚恋相亲、交友、多人语音互动于一体的APP系统03-01
相关推荐
-
雷神推出 MIX PRO II 迷你主机:基于 Ultra 200H,玻璃上盖 + ARGB 灯效
2 月 9 日消息,雷神 (THUNDEROBOT) 现已宣布推出基于英
-
制造商 Musnap 推出彩色墨水屏电纸书 Ocean C:支持手写笔、第三方安卓应用
2 月 10 日消息,制造商 Musnap 现已在海外推出一款 Oce
热文推荐
- 主从半同步降级异步分析
主从半同步降级异步分析
26-03-01 - mysql面试
mysql面试
26-03-01 - 如何让SQL速度飞起来 入门YashanDB优化器
如何让SQL速度飞起来 入门YashanDB优化器
26-03-01 - 数智化转型 | 星环科技Defensor 助力某银行数据分类分级
数智化转型 | 星环科技Defensor 助力某银行数据分类分级
26-03-01 - 刚刚!MySQL创新版9.2.0发布,学爆它!
刚刚!MySQL创新版9.2.0发布,学爆它!
26-03-01 - 【SQLAOT】一个用来辅助SQL优化工作的工具
【SQLAOT】一个用来辅助SQL优化工作的工具
26-03-01 - 开发一个集婚恋相亲、交友、多人语音互动于一体的APP系统
开发一个集婚恋相亲、交友、多人语音互动于一体的APP系统
26-03-01 - 第26期 MySQL区分大小写
第26期 MySQL区分大小写
26-03-01 - 数据库数据恢复——MySQL数据库Delete命令误删除数据如何恢复数据?
- 充电桩亿级数据处理演进实战
充电桩亿级数据处理演进实战
26-03-01
