来源:DBA烂笔头
数据库版本:5.7.21,遇见一个数据同步的bug,主库调用一个函数操作dml,导致主从异常中断。
复现如下:
1、建表插入数据CREATE TABLE `s1` (int(11) NOT NULL DEFAULT '0' COMMENT '',int(11) NOT NULL COMMENT ''ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC;insert into s1 values(1,999999);2、创建函数DELIMITER $$CREATE DEFINER=`root`@`%` FUNCTION `nextTrxSerno`() RETURNS varchar(30) CHARSET utf8BEGINDECLARE v_id INT (11) ;DECLARE v_max INT (11) ;DECLARE EXIT HANDLER FOR SQLEXCEPTION RETURN '' ;UPDATEs1 s SET= s.id + 1 ;SELECTs.id,INTO v_id,v_maxFROMs1 s ;IF v_id >= v_max - 10 THEN SET v_id = 1 ; END IF ;UPDATEs1 s SET= v_id + 1 ;RETURN CONCAT(DATE_FORMAT(SYSDATE(), '%Y%m%d%H%i%S'), LPAD(v_id, 8, '0')) ;END $$DELIMITER ;3、调用函数select nextTrxSerno();4、查看结果:主库:select * from s1;+----+--------+id | max |+----+--------+3 | 999999 |+----+--------+1 row in set (0.00 sec)备库:select * from s1;+----+--------+id | max |+----+--------+1 | 999999 |+----+--------+1 row in set (0.00 sec)备库:show slave status\G;1. row ***************************Slave_IO_State: Waiting for master to send eventMaster_Host: 172.17.134.50Master_User: replMaster_Port: 3305Connect_Retry: 60Master_Log_File: mysql_bin.000305Read_Master_Log_Pos: 8857Relay_Log_File: relay.000002Relay_Log_Pos: 6992Relay_Master_Log_File: mysql_bin.000305Slave_IO_Running: YesSlave_SQL_Running: NoReplicate_Do_DB:Replicate_Ignore_DB:Replicate_Do_Table:Replicate_Ignore_Table:Replicate_Wild_Do_Table:Replicate_Wild_Ignore_Table: sys.%,performance_schema.%,information_schema.%Last_Errno: 1032Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'b034d235-6773-11ee-afe7-00163e04944a:3984450' at master log mysql_bin.000305, end_log_pos 8826. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.Skip_Counter: 0Exec_Master_Log_Pos: 8520Relay_Log_Space: 7526Until_Condition: NoneUntil_Log_File:Until_Log_Pos: 0Master_SSL_Allowed: NoMaster_SSL_CA_File:Master_SSL_CA_Path:Master_SSL_Cert:Master_SSL_Cipher:Master_SSL_Key:Seconds_Behind_Master: NULLMaster_SSL_Verify_Server_Cert: NoLast_IO_Errno: 0Last_IO_Error:Last_SQL_Errno: 1032Last_SQL_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'b034d235-6773-11ee-afe7-00163e04944a:3984450' at master log mysql_bin.000305, end_log_pos 8826. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.Replicate_Ignore_Server_Ids:Master_Server_Id: 16708Master_UUID: b034d235-6773-11ee-afe7-00163e04944aMaster_Info_File: mysql.slave_master_infoSQL_Delay: 0SQL_Remaining_Delay: NULLSlave_SQL_Running_State:Master_Retry_Count: 86400Master_Bind:Last_IO_Error_Timestamp:Last_SQL_Error_Timestamp: 231027 21:13:43Master_SSL_Crl:Master_SSL_Crlpath:Retrieved_Gtid_Set: b034d235-6773-11ee-afe7-00163e04944a:3984428-3984450Executed_Gtid_Set: b034d235-6773-11ee-afe7-00163e04944a:1-3984449Auto_Position: 1Replicate_Rewrite_DB:Channel_Name:Master_TLS_Version:1 row in set (0.00 sec)ERROR:No query specified
经分析,这是一个 MySQL bug。 见:https://bugs.mysql.com/bug.php?id=101828

报错原理:
假设没有主键和唯一索引的表t1有{r[1],r[2],r[3],...r[x],...,r[y],...},当crc32(r[x])=crc32(r[y])时满足,对r[y]的更新将应用于从机上的r[x]。这会导致主机和从机的r[x]和r[y]值不一致。然后在主服务器上,执行删除/更新r[x]或r[y]将导致HA_ERR_END_OF_FILE(1032)错误。
(重要)规避方法: 没有主键的表添加主键 (函数操作的表没有主键)
规避演示:(见截图)
主库:
从库:
在MySQL 8.0.25中测试无该bug,尚不清楚该bug在官方什么版本被修复的。如果有知道的也可以告知下。
