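Background
During testing, a customer found that certain DML statements failed unexpectedly, with errors saying that a table does not exist or that a column is unknown.
I hit the same problem while testing a data migration: the migration failed with "Unknown column".
This was puzzling, because the table structure showed that the table and columns referenced by the SQL statement did exist.
After some investigation, it turned out that the table had triggers on it, and the error was raised for a different table, not the one named in the statement.
Reproducing the Problem
This test was run on GreatSQL 8.0.32.
1. Create the test tables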
greatsql> CREATE TABLE t1 (c1 int,c2 int,c3 int,c4 int);
greatsql> INSERT INTO t1 VALUES (1,1,1,1),(2,2,2,2),(3,3,3,3),(4,4,4,4);
greatsql> CREATE TABLE t2 (c5 int,c6 int,c7 int,c8 int);
greatsql> INSERT INTO t2 VALUES (1,1,1,1),(2,2,2,2),(3,3,3,3),(4,4,4,4);
2. Create the triggers
# Table t2 has no column c1
greatsql> CREATE TRIGGER test1
after INSERT on t1
FOR EACH ROW
INSERT INTO t2(c1) values(NEW.c1);
Query OK, 0 rows affected (0.02 sec)
greatsql> CREATE TRIGGER test2
after UPDATE on t1
FOR EACH ROW
UPDATE test.t2 SET c1=(NEW.c1)+1 WHERE c1=(NEW.c1);
Query OK, 0 rows affected (0.02 sec)
greatsql> CREATE TRIGGER test3
after DELETE on t1
FOR EACH ROW
DELETE FROM t2 WHERE c1=(OLD.c1);
Query OK, 0 rows affected (0.02 sec)
# Table t3 does not exist
greatsql> CREATE TRIGGER test4
before UPDATE on t2
FOR EACH ROW
INSERT INTO t3(c1) values(NEW.c5);
Query OK, 0 rows affected (0.00 sec)
As shown above, creating a trigger does not check whether the tables or columns referenced in the trigger's statement exist.
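Since the server does not perform this check, one option is to check the referenced objects by hand before creating a trigger. The queries below are a minimal sketch, assuming the tables live in the test schema as in this example. An empty result means the column or table is missing and the trigger would fail at run time.
# does t2 have a column named c1?
greatsql> SELECT TABLE_NAME,COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA='test' AND TABLE_NAME='t2' AND COLUMN_NAME='c1';
# does table t3 exist?
greatsql> SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA='test' AND TABLE_NAME='t3';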
3. Run the test SQL
greatsql> INSERT INTO test.t1 values (1,1,1,1);
ERROR 1054 (42S22): Unknown column 'c1' in 'field list'
greatsql> UPDATE test.t1 SET c1=110 WHERE c1=1;
ERROR 1054 (42S22): Unknown column 'c1' in 'field list'
greatsql> DELETE FROM test.t1 WHERE c1=1;
ERROR 1054 (42S22): Unknown column 'c1' in 'where clause'
greatsql> UPDATE t2 SET c5=110 WHERE c5=1;
ERROR 1146 (42S02): Table 'test.t3' doesn't exist
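The first three errors say that column c1 does not exist, but they do not say which table the column belongs to. This is misleading: t1 clearly has a c1 column, yet the statement still fails with an unknown column c1.
4. Troubleshooting
When this happens, we can enable the general log and look at the statements it records: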
shell> tail -f general5000.log
...
2024-10-14T16:21:16.837007+08:00 2651 Query INSERT INTO test.t1 values (1,1,1,1)
2024-10-14T16:21:16.839500+08:00 2651 Query INSERT INTO t2(c1) values(NEW.c1)
...
The log shows that immediately after our INSERT INTO test.t1 statement, the INSERT INTO t2(c1) statement ran automatically. Because t2 has no c1 column, the statement fails with Unknown column 'c1'.
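If the general log is not already enabled, it can be turned on at runtime. This is a minimal sketch, assuming the account is allowed to change global system variables and that log_output is left at FILE. The log file to tail is whatever general_log_file reports on your instance.
# check whether the general log is on and where it is written
greatsql> SHOW VARIABLES LIKE 'general_log%';
# enable it while reproducing the problem
greatsql> SET GLOBAL general_log=ON;
# disable it again afterwards, since it records every statement
greatsql> SET GLOBAL general_log=OFF;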
5. List the triggers on the current table
greatsql> SELECT TRIGGER_SCHEMA,TRIGGER_NAME,EVENT_OBJECT_SCHEMA,EVENT_OBJECT_TABLE,ACTION_STATEMENT FROM INFORMATION_SCHEMA.TRIGGERS WHERE EVENT_OBJECT_TABLE='t1';
+----------------+--------------+---------------------+--------------------+----------------------------------------------------+
| TRIGGER_SCHEMA | TRIGGER_NAME | EVENT_OBJECT_SCHEMA | EVENT_OBJECT_TABLE | ACTION_STATEMENT                                   |
+----------------+--------------+---------------------+--------------------+----------------------------------------------------+
| test           | test1        | test                | t1                 | INSERT INTO t2(c1) values(NEW.c1)                  |
| test           | test2        | test                | t1                 | UPDATE test.t2 SET c1=(NEW.c1)+1 WHERE c1=(NEW.c1) |
| test           | test3        | test                | t1                 | DELETE FROM t2 WHERE c1=(OLD.c1)                   |
+----------------+--------------+---------------------+--------------------+----------------------------------------------------+
3 rows in set (0.00 sec)
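When you run into the problem above, check whether the table has triggers. If it does, check whether the tables referenced by the SQL in each trigger's ACTION_STATEMENT field actually contain the column named in the error.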
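Once the faulty trigger has been identified, it can be dropped and recreated against a column that exists. The sketch below assumes, purely for illustration, that test1 was meant to write to t2.c5. The intended target column is not stated in this case, so adjust it to your own schema.
# inspect the full definition first
greatsql> SHOW CREATE TRIGGER test.test1\G
# drop the trigger and recreate it against an existing column (c5 is an assumption)
greatsql> DROP TRIGGER test.test1;
greatsql> CREATE TRIGGER test1
AFTER INSERT ON t1
FOR EACH ROW
INSERT INTO t2(c5) VALUES(NEW.c1);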
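Summary
If a DML statement fails with an error that seems unrelated to the table being modified, consider whether a trigger is attached to that table, and check the SQL statements in those triggers.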