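Background:
The project collects monitoring data through zabbix-proxy. (The network is one-way: zabbix-proxy initiates the connection to zabbix-server over the public internet and reports monitoring data, which is how the monitoring works.)
Requirement:
Because the network is one-way, when a proxy goes down zabbix-server cannot detect that zabbix-proxy or the agent machines behind it are offline, so no alert is triggered. We therefore need to monitor zabbix-proxy itself, using the proxy status that zabbix-proxy actively reports and that appears on the Proxies page of the Zabbix web interface to judge whether zabbix-proxy is alive.
Approach:
Monitor the proxy by reading the proxy information exposed by the Zabbix API.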
Official API documentation:
Version: Zabbix 4.0
- Get a token: https://www.zabbix.com/documentation/4.0/zh/manual/api/reference/user/login
- Get proxy information: https://www.zabbix.com/documentation/4.0/zh/manual/api/reference/proxy/get
#Get a token:
- #Request:
- curl -s -X POST -H 'Content-Type: application/json' -d '
- {
- "jsonrpc": "2.0",
- "method": "user.login",
- "params": {
- "user": "Admin",
- "password": "PASSWORD"
- },
- "id": 1
- }' http://172.16.10.37:8888/api_jsonrpc.php;
- #Response:
- {"jsonrpc":"2.0","result":"0qwewerwrsdfdsfdsafsd","id":1}
- #The token is 0qwewerwrsdfdsfdsafsd
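Copying the token by hand is error-prone, so a script would normally pull it out of the login response. The following is a minimal sketch, assuming the response is single-line JSON in the shape shown above; `extract_token` is a hypothetical helper name, not part of Zabbix.

```shell
#!/bin/bash
# Hypothetical helper (not part of Zabbix): read a user.login response on
# stdin and print the value of its "result" field, which is the API token.
extract_token() {
    sed -n 's/.*"result"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Sample response in the shape shown above.
response='{"jsonrpc":"2.0","result":"0qwewerwrsdfdsfdsafsd","id":1}'
token=$(printf '%s\n' "$response" | extract_token)
echo "$token"   # prints 0qwewerwrsdfdsfdsafsd
```

A response without a `result` string (for example an error object) makes the function print nothing, so the caller can check for an empty token before going further.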
#Get proxy information
- #Use the token obtained above to query proxy information through the API
- #Request:
- curl -s -X POST -H 'Content-Type: application/json' -d '
- {
- "jsonrpc": "2.0",
- "method": "proxy.get",
- "params": {
- "output": "extend",
- "selectInterface": "extend"
- },
- "auth": "0qwewerwrsdfdsfdsafsd",
- "id": 1
- }' http://172.16.10.37:8888/api_jsonrpc.php
- #Response:
- {
- "jsonrpc": "2.0",
- "result": [
- {
- "proxy_hostid": "0",
- "host": "a-proxy",
- "status": "5",
- "disable_until": "0",
- "error": "",
- "available": "0",
- "errors_from": "0",
- "lastaccess": "1637806905",
- "ipmi_authtype": "-1",
- "ipmi_privilege": "2",
- "ipmi_username": "",
- "ipmi_password": "",
- "ipmi_disable_until": "0",
- "ipmi_available": "0",
- "snmp_disable_until": "0",
- "snmp_available": "0",
- "maintenanceid": "0",
- "maintenance_status": "0",
- "maintenance_type": "0",
- "maintenance_from": "0",
- "ipmi_errors_from": "0",
- "snmp_errors_from": "0",
- "ipmi_error": "",
- "snmp_error": "",
- "jmx_disable_until": "0",
- "jmx_available": "0",
- "jmx_errors_from": "0",
- "jmx_error": "",
- "name": "",
- "flags": "0",
- "templateid": "0",
- "description": "a-proxy",
- "tls_connect": "1",
- "tls_accept": "1",
- "tls_issuer": "",
- "tls_subject": "",
- "tls_psk_identity": "",
- "tls_psk": "",
- "proxy_address": "1.1.1.1",
- "auto_compress": "1",
- "discover": "0",
- "proxyid": "10385",
- "interface": []
- },
- {
- "proxy_hostid": "0",
- "host": "b-proxy",
- "status": "5",
- "disable_until": "0",
- "error": "",
- "available": "0",
- "errors_from": "0",
- "lastaccess": "1637806906",
- "ipmi_authtype": "-1",
- "ipmi_privilege": "2",
- "ipmi_username": "",
- "ipmi_password": "",
- "ipmi_disable_until": "0",
- "ipmi_available": "0",
- "snmp_disable_until": "0",
- "snmp_available": "0",
- "maintenanceid": "0",
- "maintenance_status": "0",
- "maintenance_type": "0",
- "maintenance_from": "0",
- "ipmi_errors_from": "0",
- "snmp_errors_from": "0",
- "ipmi_error": "",
- "snmp_error": "",
- "jmx_disable_until": "0",
- "jmx_available": "0",
- "jmx_errors_from": "0",
- "jmx_error": "",
- "name": "",
- "flags": "0",
- "templateid": "0",
- "description": "b-proxy",
- "tls_connect": "1",
- "tls_accept": "1",
- "tls_issuer": "",
- "tls_subject": "",
- "tls_psk_identity": "",
- "tls_psk": "",
- "proxy_address": "1.1.1.1",
- "auto_compress": "1",
- "discover": "0",
- "proxyid": "10402",
- "interface": []
- },
- {
- "proxy_hostid": "0",
- "host": "c_proxy",
- "status": "5",
- "disable_until": "0",
- "error": "",
- "available": "0",
- "errors_from": "0",
- "lastaccess": "1637806905",
- "ipmi_authtype": "-1",
- "ipmi_privilege": "2",
- "ipmi_username": "",
- "ipmi_password": "",
- "ipmi_disable_until": "0",
- "ipmi_available": "0",
- "snmp_disable_until": "0",
- "snmp_available": "0",
- "maintenanceid": "0",
- "maintenance_status": "0",
- "maintenance_type": "0",
- "maintenance_from": "0",
- "ipmi_errors_from": "0",
- "snmp_errors_from": "0",
- "ipmi_error": "",
- "snmp_error": "",
- "jmx_disable_until": "0",
- "jmx_available": "0",
- "jmx_errors_from": "0",
- "jmx_error": "",
- "name": "",
- "flags": "0",
- "templateid": "0",
- "description": "c_proxy",
- "tls_connect": "1",
- "tls_accept": "1",
- "tls_issuer": "",
- "tls_subject": "",
- "tls_psk_identity": "",
- "tls_psk": "",
- "proxy_address": "1.1.1.1",
- "auto_compress": "1",
- "discover": "0",
- "proxyid": "10445",
- "interface": []
- }
- ],
- "id": 1
- }
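The response above returns every field for every proxy, although the check in this article only needs `host` and `lastaccess`. As a sketch, the request can list just those properties in `output`, which `proxy.get` accepts in place of `extend`. The function names `proxy_get_body` and `proxy_get` are hypothetical, and the URL is the one used throughout this article.

```shell
#!/bin/bash
# Hypothetical helper: build a proxy.get request body that asks only for
# the host and lastaccess properties. $1 is the API token.
proxy_get_body() {
    printf '{"jsonrpc":"2.0","method":"proxy.get","params":{"output":["host","lastaccess"]},"auth":"%s","id":1}\n' "$1"
}

# Hypothetical wrapper: send the body to the API endpoint used in this article.
proxy_get() {
    curl -s -X POST -H 'Content-Type: application/json' \
        -d "$(proxy_get_body "$1")" http://172.16.10.37:8888/api_jsonrpc.php
}

# Show the request body without contacting the server.
proxy_get_body "0qwewerwrsdfdsfdsafsd"
```

A smaller response is also easier to parse with text tools, since there are fewer fields to get wrong.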
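Filter the response further and find the lastaccess field. Its value is the Unix timestamp of the proxy's most recent contact with zabbix-server, and in this setup it changes within every 5 seconds. Comparing this value with the current timestamp on zabbix-server gives a time difference that shows whether the proxy is healthy.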
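The comparison described above comes down to one subtraction. The sketch below only illustrates that arithmetic; `proxy_state` is a hypothetical function, and the 60-second threshold is the one used for the trigger later in this article.

```shell
#!/bin/bash
# Hypothetical helper: proxy_state NOW LASTACCESS THRESHOLD
# Prints "stale" when the proxy has been silent for more than THRESHOLD
# seconds, otherwise "ok". All arguments are integers (Unix seconds).
proxy_state() {
    local age=$(( $1 - $2 ))
    if [ "$age" -gt "$3" ]; then
        echo "stale"
    else
        echo "ok"
    fi
}

# lastaccess value taken from the a-proxy entry in the response above.
proxy_state 1637806910 1637806905 60   # age 5 seconds  -> ok
proxy_state 1637807000 1637806905 60   # age 95 seconds -> stale

# Against the live clock: proxy_state "$(date +%s)" 1637806905 60
```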
Add the monitoring item:
Get the value of the lastaccess field
Create the monitoring script:
- [root@sre ~]# cd /etc/zabbix/zabbix_agentd.d
- [root@sre zabbix_agentd.d]# vim a-proxy-check.sh
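- #!/bin/bash
- # Print the lastaccess timestamp of the first proxy in the proxy.get response (a-proxy here).
- # The awk chain relies on the compact field order returned by the API: lastaccess is the 8th field of the first result object.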
- curl -s -X POST -H 'Content-Type: application/json' -d '
- {
- "jsonrpc": "2.0",
- "method": "proxy.get",
- "params": {
- "output": "extend",
- "selectInterface": "extend"
- },
- "auth": "0qwewerwrsdfdsfdsafsd",
- "id": 1
- }' http://172.16.10.37:8888/api_jsonrpc.php | awk -F '{"' '{print $3}' | awk -F ',' '{print $8}' | awk -F '"' '{print $4}'
- [root@sre zabbix_agentd.d]# chmod +x /etc/zabbix/zabbix_agentd.d/a-proxy-check.sh
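The awk chain in the script above picks fields by position, so it returns the first proxy in the list and breaks if the field order changes. One possible alternative, sketched here with standard `tr`, `sed`, and `awk`, selects the proxy by its `host` name instead. `lastaccess_for` is a hypothetical helper; it assumes all values are JSON strings, as in the response shown earlier, and that `host` and `lastaccess` sit in the same object.

```shell
#!/bin/bash
# Hypothetical helper: print lastaccess for the proxy whose "host" equals $1.
# Reads a proxy.get response on stdin (compact or pretty-printed).
lastaccess_for() {
    tr -d '\n' |
    sed 's/"[[:space:]]*:[[:space:]]*"/":"/g' |
    awk -v name="$1" '
        BEGIN { RS = "{" }                      # one record per JSON object
        index($0, "\"host\":\"" name "\"") {    # the object for this proxy
            if (match($0, /"lastaccess":"[0-9]+"/)) {
                # skip the 14-character prefix "lastaccess":" and the closing quote
                print substr($0, RSTART + 14, RLENGTH - 15)
            }
        }'
}

# Shortened sample with two of the proxies from the response above.
sample='{"jsonrpc":"2.0","result":[{"host":"a-proxy","lastaccess":"1637806905","proxyid":"10385"},{"host":"b-proxy","lastaccess":"1637806906","proxyid":"10402"}],"id":1}'
printf '%s\n' "$sample" | lastaccess_for b-proxy   # prints 1637806906
```

In the check script, the output of the curl command would be piped into `lastaccess_for a-proxy` in place of the three awk commands. A proxy name that is not in the response produces no output.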
Edit the zabbix_agentd.conf configuration file to define a key that points to the script
- [root@sre ~]# vim /etc/zabbix/zabbix_agentd.conf
- ........
- UnsafeUserParameters=1
- #Define a custom key that monitors the status of a-proxy
- UserParameter=a_proxy_status,/bin/bash /etc/zabbix/zabbix_agentd.d/a-proxy-check.sh
- ........
Restart zabbix-agent
- systemctl restart zabbix-agent
Log in to zabbix-server and test the item with zabbix_get
- [root@sre zabbix]# zabbix_get -s 172.16.10.37 -p 10050 -k "a_proxy_status"
- 1637923240
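Log in to the Zabbix web console,
#Add the item
#Add the trigger
The trigger expression means: when the a-proxy timestamp and the current Zabbix timestamp differ by more than 60 seconds, fire an alert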
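The text above describes the trigger only in words. As an assumption about what such a trigger could look like in Zabbix 4.0, one expression that matches the description uses the `fuzzytime` trigger function, which returns 0 when an item's timestamp value differs from the Zabbix server time by more than the given number of seconds. `<host>` is a placeholder for the name of the host that carries the `a_proxy_status` item, and the item needs a numeric value type for the function to apply.

```
{<host>:a_proxy_status.fuzzytime(60)}=0
```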
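Change the trigger threshold to simulate an alert
Summary: this post was written out of necessity. A one-way network and monitoring across the internet make it a real production case.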