背景概述
客户反映,业务上某张表的 insert 操作速度很慢,单条 insert 语句的最大执行时间超过了 5 秒。在收到客户问题后,我们仔细检查了数据库状态以及主机的负载情况,发现目前一切正常,并没有发现数据库故障或主机负载过高导致 insert 操作变慢的问题。
因此,我们分析了慢日志,希望从中找出问题。经过分析,发现这条插入语句的query_time和lock_time几乎相同,因此怀疑是由于锁等待导致插入操作变慢。随后,我们捕获了通用日志,几乎同一时间这张表有update,insert操作,发现由于更新操作阻塞了插入操作,导致插入速度下降的问题。这个更新操作所在的事务包含了多条 SQL 语句,因此如果该事务执行时间较长,就会阻塞插入操作,导致插入操作的执行时间延长。
问题复现
本次测试基于 GreatSQL 8.0.32-25,隔离级别为 RR
2.1 创建测试表
greatsql> CREATE TABLE `t11` (
`id` int NOT NULL,
`c1` int DEFAULT NULL,
`c2` int DEFAULT NULL,
`c3` int DEFAULT NULL,
`c4` int DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `c2` (`c2`,`c3`),
KEY `c4` (`c4`)
);
greatsql> insert into t11 values (1,1,1,1,1),(2,2,2,2,2),(3,3,3,3,3),(5,5,5,5,5);
2.2 事务执行顺序
时间 | 事务1 | 事务2 |
T1 | BEGIN; | BEGIN; |
T2 | update t10 set c2=20 where c4=2; | |
T3 | insert into t10 values (6,2,2,2,2); | |
T4 | -- hang住,处于锁等待 | |
T5 | commit; | -- 锁等待结束 |
T6 | commit; |
2.3 事务1执行
greatsql> begin;
greatsql> update t11 set c2=20 where c4=2;
查看加锁情况:
greatsql> select THREAD_ID,EVENT_ID,ENGINE_LOCK_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;
+-----------+----------+-------------------------------------------+---------------+-------------+------------+-----------+---------------+-------------+-----------+
| THREAD_ID | EVENT_ID | ENGINE_LOCK_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA |
+-----------+----------+-------------------------------------------+---------------+-------------+------------+-----------+---------------+-------------+-----------+
| 55 | 20 | 140531661278568:44172:140531678523168 | test | t11 | NULL | TABLE | IX | GRANTED | NULL |
| 55 | 20 | 140531661278568:43110:6:3:140531678129184 | test | t11 | c4 | RECORD | X | GRANTED | 2, 2 |
| 55 | 20 | 140531661278568:43110:4:3:140531678129528 | test | t11 | PRIMARY | RECORD | X,REC_NOT_GAP | GRANTED | 2 |
| 55 | 20 | 140531661278568:43110:6:4:140531678129872 | test | t11 | c4 | RECORD | X,GAP | GRANTED | 3, 3 |
+-----------+----------+-------------------------------------------+---------------+-------------+------------+-----------+---------------+-------------+-----------+
4 rows in set (0.01 sec)
可以看到此时给【3, 3】这条数据加加了X,GAP锁
2.4 事务2执行
greatsql> begin;
greatsql> insert into t11 values (6,2,2,2,2);
查看加锁情况:
greatsql> select THREAD_ID,EVENT_ID,ENGINE_LOCK_ID,OBJECT_SCHEMA,OBJECT_NAME,INDEX_NAME,LOCK_TYPE,LOCK_MODE,LOCK_STATUS,LOCK_DATA from performance_schema.data_locks;
+-----------+----------+-------------------------------------------+---------------+-------------+------------+-----------+------------------------+-------------+-----------+
| THREAD_ID | EVENT_ID | ENGINE_LOCK_ID | OBJECT_SCHEMA | OBJECT_NAME | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_STATUS | LOCK_DATA |
+-----------+----------+-------------------------------------------+---------------+-------------+------------+-----------+------------------------+-------------+-----------+
| 56 | 14 | 140531661279416:44172:140531678523936 | test | t11 | NULL | TABLE | IX | GRANTED | NULL |
| 56 | 14 | 140531661279416:43110:6:4:140531678132256 | test | t11 | c4 | RECORD | X,GAP,INSERT_INTENTION | WAITING | 3, 3 |
| 55 | 20 | 140531661278568:44172:140531678523168 | test | t11 | NULL | TABLE | IX | GRANTED | NULL |
| 55 | 20 | 140531661278568:43110:6:3:140531678129184 | test | t11 | c4 | RECORD | X | GRANTED | 2, 2 |
| 55 | 20 | 140531661278568:43110:4:3:140531678129528 | test | t11 | PRIMARY | RECORD | X,REC_NOT_GAP | GRANTED | 2 |
| 55 | 20 | 140531661278568:43110:6:4:140531678129872 | test | t11 | c4 | RECORD | X,GAP | GRANTED | 3, 3 |
+-----------+----------+-------------------------------------------+---------------+-------------+------------+-----------+------------------------+-------------+-----------+
6 rows in set (0.00 sec)
greatsql> select REQUESTING_THREAD_ID,REQUESTING_EVENT_ID,REQUESTING_ENGINE_LOCK_ID,BLOCKING_THREAD_ID,BLOCKING_EVENT_ID,BLOCKING_ENGINE_LOCK_ID from performance_schema.data_lock_waits;
+----------------------+---------------------+-------------------------------------------+--------------------+-------------------+-------------------------------------------+
| REQUESTING_THREAD_ID | REQUESTING_EVENT_ID | REQUESTING_ENGINE_LOCK_ID | BLOCKING_THREAD_ID | BLOCKING_EVENT_ID | BLOCKING_ENGINE_LOCK_ID |
+----------------------+---------------------+-------------------------------------------+--------------------+-------------------+-------------------------------------------+
| 56 | 14 | 140531661279416:43110:6:4:140531678132256 | 55 | 20 | 140531661278568:43110:6:4:140531678129872 |
+----------------------+---------------------+-------------------------------------------+--------------------+-------------------+-------------------------------------------+
1 row in set (0.00 sec)
通过上面2张表,可以看到 X,GAP锁 阻塞了 X,GAP,INSERT_INTENTION 锁;
2.5 结论
此次Insert慢的原因就是Update语句所在的事务执行时间较长,Update语句产生了GAP锁;
Insert语句在执行时此Update语句所在事务还没有执行完成,因此Insert处于锁等待阶段,待Update所在事务提交后Insert才提交;
总结
导致此次问题的原因是 GAP 锁阻塞了 INSERT_INTENTION 锁;因此建议客户在执行 update 操作时,where 条件用主键列,这样可以避免加 GAP 锁。