介绍
wal2json 是一个用于逻辑解码的输出插件。这意味着插件可以访问 INSERT 和 UPDATE 生成的元组。此外,还可以根据配置的 REPLICA IDENTITY 访问 UPDATE/DELETE 的旧行版本。可以使用流复制协议(逻辑复制槽)或特殊的 SQL API 来使用更改。
格式版本 1 为每个事务生成一个 JSON 对象。所有新/旧元组都会出现在 JSON 对象中。此外,还有一些选项可以控制包含事务时间戳、模式限定、数据类型和事务 ID 等属性。
格式版本 2 为每个元组生成一个 JSON 对象。事务开始和结束可选择生成 JSON 对象。此外,还有多种选项可以控制包含属性。
配置
postgresql.conf
您需要在 postgresql.conf 中设置wal_level参数:
wal_level = logical
更改此参数后,需要重新启动。
参数
• include-xids:将 xid 添加到每个变更集。默认值为 false。
• include-timestamp:为每个变更集添加时间戳。默认值为 false。
• include-schemas:为每个更改添加模式。默认值为 true。
• include-types:为每个更改添加类型。默认值为 true。
• include-typmod:将修饰符添加到具有修饰符的类型(例如 varchar(20) 而不是 varchar)。默认值为 true。
• include-type-oids:添加类型 OID。默认值为 false。
• include-domain-data-type:将域类型名替换为底层数据类型。默认值为 false。
• include-column-positions:添加列位置(pg_attribute.attnum)。默认值为 false。
• include-origin:添加一段数据以包含来源。默认值为 false。
• include-not-null:以 column optionals 形式添加 not null 信息。默认值为 false。
• include-default:添加默认值表达式。默认值为 false。
• include-pk:以 pk 形式添加主键信息。包含列名和数据类型。默认值为 false。
• numeric-data-types-as-string:对 numeric 数据类型使用字符串。JSON 规范不能将Infinity和NaN识别为有效的数值。对于双精度浮点数,可能会存在潜在的互操作性问题。默认值为 false。
• pretty-print:向 JSON 结构添加空格和缩进。默认值为 false。
• write-in-chunks:在每个更改之后写入,而不是在每个更改集之后写入。仅当format-version是1时才使用。默认值为 false。
• include-lsn:将 nextlsn 添加到每个变更集。默认值为 false。
• include-transaction:发出用来表示每个事务的开始和结束的记录。默认值为 true。
• include-unchanged-toast(已弃用):不要使用它。它已弃用。
• filter-origins:排除指定来源的更改。默认值为空,这意味着不会过滤任何来源。它是一个用逗号分隔的值。
• filter-tables:排除指定表中的行。默认值为空,这意味着不会筛选任何表。它是一个逗号分隔的值。这些表应该是模式限定的。*.foo表示所有模式中的 foo 表,bar.*表示模式 bar 中的所有表。特殊字符(空格、单引号、逗号、句点、星号)必须用反斜杠进行转义。模式和表区分大小写。表"public"."Foo bar"应指定为public.Foo\ bar。
• add-tables:仅包含指定表中的行。默认值为所有模式中的所有表。它与filter-tables中的规则相同。
• filter-msg-prefixes:如果消息前缀在列表中,则排除消息。默认值为空,这意味着不会过滤任何消息。它是一个逗号分隔的值。
• add-msg-prefixes:仅包括消息前缀在列表中的消息。默认为所有前缀。它是一个逗号分隔的值。wal2json会在此参数之前先应用filter-msg-prefixes。
• format-version:定义要使用的格式。默认值为 1。
• actions:定义发送哪些操作。默认为所有操作(插入、更新、删除和截断)。但是,如果您使用的format-version是 1,则不会启用 truncate(向后兼容性)。
示例
有两种方法可以从 wal2json 插件获取更改(JSON 对象):通过 SQL 调用函数,或者使用 pg_recvlogical。
pg_recvlogical
除了上面的配置之外,还需要配置复制连接以使用 pg_recvlogical。版本 9.4、9.5 和 9.6 中的逻辑复制连接,要求数据库列中有replication关键字。从版本 10 开始,逻辑复制可以匹配带数据库名称或关键字(如all)的普通条目。
首先,在 pg_hba.conf(9.4、9.5 和 9.6)添加一条复制连接规则:
local replication myuser trust
如果您使用的是版本 10 或更高版本:
local mydatabase myuser trust
另外,在 postgresql.conf 中设置 max_wal_senders:
max_wal_senders = 1
如果更改了 max_wal_senders,则需要重新启动。
现在,您已准备好试用 wal2json 了。可以在一个终端中执行:
$ pg_recvlogical -d postgres --slot test_slot --create-slot -P wal2json
$ pg_recvlogical -d postgres --slot test_slot --start -o pretty-print=1 -o add-msg-prefixes=wal2json -f -
在另一个终端中:
$ cat /tmp/example1.sql
CREATE TABLE table1_with_pk (a SERIAL, b VARCHAR(30), c TIMESTAMP NOT NULL, PRIMARY KEY(a, c));
CREATE TABLE table1_without_pk (a SERIAL, b NUMERIC(5,2), c TEXT);
BEGIN;
INSERT INTO table1_with_pk (b, c) VALUES('Backup and Restore', now());
INSERT INTO table1_with_pk (b, c) VALUES('Tuning', now());
INSERT INTO table1_with_pk (b, c) VALUES('Replication', now());
SELECT pg_logical_emit_message(true, 'wal2json', 'this message will be delivered');
SELECT pg_logical_emit_message(true, 'pgoutput', 'this message will be filtered');
DELETE FROM table1_with_pk WHERE a < 3;
SELECT pg_logical_emit_message(false, 'wal2json', 'this non-transactional message will be delivered even if you rollback the transaction');
INSERT INTO table1_without_pk (b, c) VALUES(2.34, 'Tapir');
-- it is not added to stream because there isn't a pk or a replica identity
UPDATE table1_without_pk SET c = 'Anta' WHERE c = 'Tapir';
COMMIT;
DROP TABLE table1_with_pk;
DROP TABLE table1_without_pk;
上面的脚本产生了以下输出:
$ psql -At -f /tmp/example1.sql postgres
CREATE TABLE
CREATE TABLE
BEGIN
INSERT 0 1
INSERT 0 1
INSERT 0 1
3/78BFC828
3/78BFC880
DELETE 2
3/78BFC990
INSERT 0 1
UPDATE 1
COMMIT
DROP TABLE
DROP TABLE
第一个终端的输出为:
{
"change": [
]
}
{
"change": [
]
}
{
"change": [
{
"kind": "message",
"transactional": false,
"prefix": "wal2json",
"content": "this non-transactional message will be delivered even if you rollback the transaction"
}
]
}
WARNING: table "table1_without_pk" without primary key or replica identity is nothing
{
"change": [
{
"kind": "insert",
"schema": "public",
"table": "table1_with_pk",
"columnnames": ["a", "b", "c"],
"columntypes": ["integer", "character varying(30)", "timestamp without time zone"],
"columnvalues": [1, "Backup and Restore", "2018-03-27 11:58:28.988414"]
}
,{
"kind": "insert",
"schema": "public",
"table": "table1_with_pk",
"columnnames": ["a", "b", "c"],
"columntypes": ["integer", "character varying(30)", "timestamp without time zone"],
"columnvalues": [2, "Tuning", "2018-03-27 11:58:28.988414"]
}
,{
"kind": "insert",
"schema": "public",
"table": "table1_with_pk",
"columnnames": ["a", "b", "c"],
"columntypes": ["integer", "character varying(30)", "timestamp without time zone"],
"columnvalues": [3, "Replication", "2018-03-27 11:58:28.988414"]
}
,{
"kind": "message",
"transactional": true,
"prefix": "wal2json",
"content": "this message will be delivered"
}
,{
"kind": "delete",
"schema": "public",
"table": "table1_with_pk",
"oldkeys": {
"keynames": ["a", "c"],
"keytypes": ["integer", "timestamp without time zone"],
"keyvalues": [1, "2018-03-27 11:58:28.988414"]
}
}
,{
"kind": "delete",
"schema": "public",
"table": "table1_with_pk",
"oldkeys": {
"keynames": ["a", "c"],
"keytypes": ["integer", "timestamp without time zone"],
"keyvalues": [2, "2018-03-27 11:58:28.988414"]
}
}
,{
"kind": "insert",
"schema": "public",
"table": "table1_without_pk",
"columnnames": ["a", "b", "c"],
"columntypes": ["integer", "numeric(5,2)", "text"],
"columnvalues": [1, 2.34, "Tapir"]
}
]
}
{
"change": [
]
}
{
"change": [
]
}
在第一个终端中删除复制槽:
Ctrl+C
$ pg_recvlogical -d postgres --slot test_slot --drop-slot
SQL 函数
$ cat /tmp/example2.sql
CREATE TABLE table2_with_pk (a SERIAL, b VARCHAR(30), c TIMESTAMP NOT NULL, PRIMARY KEY(a, c));
CREATE TABLE table2_without_pk (a SERIAL, b NUMERIC(5,2), c TEXT);
SELECT 'init' FROM pg_create_logical_replication_slot('test_slot', 'wal2json');
BEGIN;
INSERT INTO table2_with_pk (b, c) VALUES('Backup and Restore', now());
INSERT INTO table2_with_pk (b, c) VALUES('Tuning', now());
INSERT INTO table2_with_pk (b, c) VALUES('Replication', now());
SELECT pg_logical_emit_message(true, 'wal2json', 'this message will be delivered');
SELECT pg_logical_emit_message(true, 'pgoutput', 'this message will be filtered');
DELETE FROM table2_with_pk WHERE a < 3;
SELECT pg_logical_emit_message(false, 'wal2json', 'this non-transactional message will be delivered even if you rollback the transaction');
INSERT INTO table2_without_pk (b, c) VALUES(2.34, 'Tapir');
-- it is not added to stream because there isn't a pk or a replica identity
UPDATE table2_without_pk SET c = 'Anta' WHERE c = 'Tapir';
COMMIT;
SELECT data FROM pg_logical_slot_get_changes('test_slot', NULL, NULL, 'pretty-print', '1', 'add-msg-prefixes', 'wal2json');
SELECT 'stop' FROM pg_drop_replication_slot('test_slot');
DROP TABLE table2_with_pk;
DROP TABLE table2_without_pk;
上面的脚本产生了以下输出:
$ psql -At -f /tmp/example2.sql postgres
CREATE TABLE
CREATE TABLE
init
BEGIN
INSERT 0 1
INSERT 0 1
INSERT 0 1
3/78C2CA50
3/78C2CAA8
DELETE 2
3/78C2CBD8
INSERT 0 1
UPDATE 1
COMMIT
{
"change": [
{
"kind": "message",
"transactional": false,
"prefix": "wal2json",
"content": "this non-transactional message will be delivered even if you rollback the transaction"
}
]
}
psql:/tmp/example2.sql:17: WARNING: table "table2_without_pk" without primary key or replica identity is nothing
{
"change": [
{
"kind": "insert",
"schema": "public",
"table": "table2_with_pk",
"columnnames": ["a", "b", "c"],
"columntypes": ["integer", "character varying(30)", "timestamp without time zone"],
"columnvalues": [1, "Backup and Restore", "2018-03-27 12:05:29.914496"]
}
,{
"kind": "insert",
"schema": "public",
"table": "table2_with_pk",
"columnnames": ["a", "b", "c"],
"columntypes": ["integer", "character varying(30)", "timestamp without time zone"],
"columnvalues": [2, "Tuning", "2018-03-27 12:05:29.914496"]
}
,{
"kind": "insert",
"schema": "public",
"table": "table2_with_pk",
"columnnames": ["a", "b", "c"],
"columntypes": ["integer", "character varying(30)", "timestamp without time zone"],
"columnvalues": [3, "Replication", "2018-03-27 12:05:29.914496"]
}
,{
"kind": "message",
"transactional": true,
"prefix": "wal2json",
"content": "this message will be delivered"
}
,{
"kind": "delete",
"schema": "public",
"table": "table2_with_pk",
"oldkeys": {
"keynames": ["a", "c"],
"keytypes": ["integer", "timestamp without time zone"],
"keyvalues": [1, "2018-03-27 12:05:29.914496"]
}
}
,{
"kind": "delete",
"schema": "public",
"table": "table2_with_pk",
"oldkeys": {
"keynames": ["a", "c"],
"keytypes": ["integer", "timestamp without time zone"],
"keyvalues": [2, "2018-03-27 12:05:29.914496"]
}
}
,{
"kind": "insert",
"schema": "public",
"table": "table2_without_pk",
"columnnames": ["a", "b", "c"],
"columntypes": ["integer", "numeric(5,2)", "text"],
"columnvalues": [1, 2.34, "Tapir"]
}
]
}
stop
DROP TABLE
DROP TABLE
让我们采用format-version为 2 来重复同样的例子:
$ cat /tmp/example3.sql
CREATE TABLE table3_with_pk (a SERIAL, b VARCHAR(30), c TIMESTAMP NOT NULL, PRIMARY KEY(a, c));
CREATE TABLE table3_without_pk (a SERIAL, b NUMERIC(5,2), c TEXT);
SELECT 'init' FROM pg_create_logical_replication_slot('test_slot', 'wal2json');
BEGIN;
INSERT INTO table3_with_pk (b, c) VALUES('Backup and Restore', now());
INSERT INTO table3_with_pk (b, c) VALUES('Tuning', now());
INSERT INTO table3_with_pk (b, c) VALUES('Replication', now());
SELECT pg_logical_emit_message(true, 'wal2json', 'this message will be delivered');
SELECT pg_logical_emit_message(true, 'pgoutput', 'this message will be filtered');
DELETE FROM table3_with_pk WHERE a < 3;
SELECT pg_logical_emit_message(false, 'wal2json', 'this non-transactional message will be delivered even if you rollback the transaction');
INSERT INTO table3_without_pk (b, c) VALUES(2.34, 'Tapir');
-- it is not added to stream because there isn't a pk or a replica identity
UPDATE table3_without_pk SET c = 'Anta' WHERE c = 'Tapir';
COMMIT;
SELECT data FROM pg_logical_slot_get_changes('test_slot', NULL, NULL, 'format-version', '2', 'add-msg-prefixes', 'wal2json');
SELECT 'stop' FROM pg_drop_replication_slot('test_slot');
DROP TABLE table3_with_pk;
DROP TABLE table3_without_pk;
上面的脚本产生了以下输出:
$ psql -At -f /tmp/example3.sql postgres
CREATE TABLE
CREATE TABLE
init
BEGIN
INSERT 0 1
INSERT 0 1
INSERT 0 1
3/78CB8F30
3/78CB8F88
DELETE 2
3/78CB90B8
INSERT 0 1
UPDATE 1
COMMIT
psql:/tmp/example3.sql:20: WARNING: no tuple identifier for UPDATE in table "public"."table3_without_pk"
{"action":"M","transactional":false,"prefix":"wal2json","content":"this non-transactional message will be delivered even if you rollback the transaction"}
{"action":"B"}
{"action":"I","schema":"public","table":"table3_with_pk","columns":[{"name":"a","type":"integer","value":1},{"name":"b","type":"character varying(30)","value":"Backup and Restore"},{"name":"c","type":"timestamp without time zone","value":"2019-12-29 04:58:34.806671"}]}
{"action":"I","schema":"public","table":"table3_with_pk","columns":[{"name":"a","type":"integer","value":2},{"name":"b","type":"character varying(30)","value":"Tuning"},{"name":"c","type":"timestamp without time zone","value":"2019-12-29 04:58:34.806671"}]}
{"action":"I","schema":"public","table":"table3_with_pk","columns":[{"name":"a","type":"integer","value":3},{"name":"b","type":"character varying(30)","value":"Replication"},{"name":"c","type":"timestamp without time zone","value":"2019-12-29 04:58:34.806671"}]}
{"action":"M","transactional":true,"prefix":"wal2json","content":"this message will be delivered"}
{"action":"D","schema":"public","table":"table3_with_pk","identity":[{"name":"a","type":"integer","value":1},{"name":"c","type":"timestamp without time zone","value":"2019-12-29 04:58:34.806671"}]}
{"action":"D","schema":"public","table":"table3_with_pk","identity":[{"name":"a","type":"integer","value":2},{"name":"c","type":"timestamp without time zone","value":"2019-12-29 04:58:34.806671"}]}
{"action":"I","schema":"public","table":"table3_without_pk","columns":[{"name":"a","type":"integer","value":1},{"name":"b","type":"numeric(5,2)","value":2.34},{"name":"c","type":"text","value":"Tapir"}]}
{"action":"C"}
stop
DROP TABLE
DROP TABLE