Configuring the Nagios Client
I. Client installation and configuration
1. Install the client:
- wget http://blog.vlvtu.com/nrpe_install.zip
- unzip nrpe_install.zip
- cd nrpe_install
- ./setup.sh
2. Add a startup entry:
- echo "/usr/local/nrpe/bin/nrpe -c /usr/local/nrpe/etc/nrpe.cfg -d" >> /etc/rc.d/rc.local
3. Start the daemon:
- /usr/local/nrpe/bin/nrpe -c /usr/local/nrpe/etc/nrpe.cfg -d
4. Check the system log:
- tail -f /var/log/messages
- Oct 20 16:19:38 webhost2 nrpe[3782]: Starting up daemon
- Oct 20 16:19:38 webhost2 nrpe[3782]: Listening for connections on port 5666
- Oct 20 16:19:38 webhost2 nrpe[3782]: Allowing connections from: 127.0.0.1,192.168.1.11
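The log check in step 4 can also be scripted instead of watching `tail -f` by hand. The following is a minimal sketch, assuming the syslog file is /var/log/messages (the path can differ between distributions); `nrpe_listening` is a hypothetical helper name, not part of NRPE.

```shell
# Hypothetical helper: succeed if the given log file contains NRPE's
# "Listening for connections on port <port>" line (default port 5666).
nrpe_listening() {
    log_file=$1
    port=${2:-5666}
    grep -q "nrpe\[[0-9]*]: Listening for connections on port $port\$" "$log_file"
}

# Possible usage on the client:
#   nrpe_listening /var/log/messages && echo "NRPE is listening"
```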
5. Test whether NRPE has started correctly on the local host:
- /usr/local/nrpe/libexec/check_nrpe -H 192.168.1.11
- NRPE v2.12
6. Firewall configuration (NRPE listens on this host itself, so the rule belongs in the INPUT chain):
- iptables -A INPUT -i eth0 -p tcp --dport 5666 -j ACCEPT
II. Configuring the NRPE service on the monitoring host
1. View the usage of the new check_nrpe plugin
- /usr/local/nagios/libexec/check_nrpe -h|less
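- Usage: check_nrpe -H <host> [-n] [-u] [-p <port>] [-t <timeout>] [-c <command>]
Typical usage: check_nrpe -H hostname -p NRPE-port -c NRPE-command-name
Options:
- <host> = The address of the host running the NRPE daemon
Host: the name of the remote monitored host that runs the NRPE daemon. This host name must already be defined in a host definition.
- [port] = The port on which the daemon is running (default=5666)
Port: the port on which NRPE runs on the monitored host. The default is 5666, and it does not need to be specified when the default is used.
- [command] = The name of the command that the remote daemon should run
Command: the command name must be one that is defined for the NRPE daemon on the monitored host.
Check whether the monitoring server can communicate with the remote Linux host 192.168.1.11. If communication works, the NRPE version number is returned.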
- /usr/local/nagios/libexec/check_nrpe -H 192.168.1.11
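The connectivity check can be wrapped in a small function so that it can be reused for several hosts. This is only a sketch: `nrpe_reachable` and the `CHECK_NRPE` override variable are hypothetical names introduced here, and the function relies on check_nrpe printing a reply that begins with "NRPE v", as in the example output shown earlier.

```shell
# CHECK_NRPE is an assumed override hook; the default is the plugin path
# used on the monitoring server in this article.
CHECK_NRPE=${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}

# Hypothetical helper: succeed (and print the reply) only if the host
# answers with an NRPE version banner.
nrpe_reachable() {
    reply=$("$CHECK_NRPE" -H "$1" 2>&1) || return 1
    case $reply in
        "NRPE v"*) printf '%s\n' "$reply" ;;
        *) return 1 ;;
    esac
}

# Possible usage:
#   nrpe_reachable 192.168.1.11 || echo "NRPE not reachable on 192.168.1.11"
```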
2. Add the NRPE command to the commands.cfg command definition file.
- vi /usr/local/nagios/etc/commands.cfg
- # NRPE Command
Add the NRPE command definition at the bottom of the file.
- #check nrpe
- define command{
- command_name check_nrpe
- command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
- }
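A few points need explaining:
(1) The command defined here is named check_nrpe.
(2) $USER1$/check_nrpe resolves, through resource.cfg, to the absolute path /usr/local/nagios/libexec/check_nrpe.
(3) -H $HOSTADDRESS$ supplies the IP address of the monitored host. The $HOSTADDRESS$ macro is resolved by looking up the host name and taking the address value from its host definition.
(4) -c $ARG1$ specifies the name of the NRPE command to run, which must be defined for the NRPE daemon on the monitored host.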
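To make the macro substitution concrete, the sketch below mimics in shell how the command_line is filled in for one check. It is an illustration only, not how Nagios is implemented: `expand_check_nrpe` is a hypothetical helper, the three arguments stand for the values of $USER1$, $HOSTADDRESS$ and $ARG1$, and the values are assumed to contain no `|`, `&` or backslash characters.

```shell
# Hypothetical helper that substitutes the three macros of the check_nrpe
# command_line. Arguments: $USER1$ value, host address, $ARG1$ value.
expand_check_nrpe() {
    template='$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$'
    printf '%s\n' "$template" | sed \
        -e 's|[$]USER1[$]|'"$1"'|g' \
        -e 's|[$]HOSTADDRESS[$]|'"$2"'|g' \
        -e 's|[$]ARG1[$]|'"$3"'|g'
}

expand_check_nrpe /usr/local/nagios/libexec 192.168.1.11 check_df
# prints: /usr/local/nagios/libexec/check_nrpe -H 192.168.1.11 -c check_df
```

In a service definition, the text after the `!` in check_command (for example check_nrpe!check_df) is what becomes $ARG1$.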
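3. On the Nagios monitoring server, add the remote hosts and services to be monitored through NRPE, using the NRPE command definition:
Add the host definitions to hosts.cfg
- vi /usr/local/nagios/etc/objects/hosts.cfg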
- define host {
- host_name nagios-server
- alias nagios server
- address 202.96.155.155
- contact_groups sagroup
- check_command check-host-alive
- max_check_attempts 3
- notification_interval 10
- notification_period 24x7
- notification_options d,u,r
- }
- define host{
- use linux-server
- host_name test-cnname server
- alias test-cnname server
- address 202.96.155.155
- contact_groups sagroup
- check_command check-host-alive
- max_check_attempts 3
- notification_interval 10
- notification_period 24x7
- notification_options d,u,r
- }
Test the configuration file
- [root@app ~]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
………………………………..
- Checking services...
- Checked 5 services.
- Checking hosts...
- Warning: Host '202.96.155.155' has no services associated with it!
- Checked 2 hosts.
- Checking host groups...
- Total Warnings: 1
- Total Errors: 0
There is one warning, because services.cfg has not been configured yet.
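The summary lines at the end of the pre-flight output can also be checked by a script rather than by eye. A minimal sketch, assuming the output contains "Total Warnings:" and "Total Errors:" lines as in the sample above; `preflight_ok` is a hypothetical helper name.

```shell
# Hypothetical helper: read the output of "nagios -v" on stdin, print the
# two summary lines, and succeed only if the error count is zero.
# Warnings are shown but do not cause a failure.
preflight_ok() {
    summary=$(grep -E '^Total (Warnings|Errors):' | tr -s ' ')
    printf '%s\n' "$summary"
    printf '%s\n' "$summary" | grep -q '^Total Errors: 0$'
}

# Possible usage:
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | preflight_ok
```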
- vi /usr/local/nagios/etc/objects/services.cfg
- # address 192.168.1.11 is the address of the remote Linux server
The following service definition monitors disk usage on the 192.168.1.11 server:
- define service{
- host_name test-cnname server
- service_description check-disk
- check_command check_nrpe!check_df
- max_check_attempts 4
- normal_check_interval 3
- retry_check_interval 2
- check_period 24x7
- notification_interval 10
- notification_period 24x7
- notification_options w,u,c,r
- contact_groups sagroup
- }
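The service passes check_df to check_nrpe, so a command with that name must be defined in nrpe.cfg on the monitored host; the steps above do not show that definition. The sketch below is one way to add it, under several assumptions that are not from the original text: that the check_disk plugin is installed under /usr/local/nrpe/libexec, that the thresholds (warn at 20% free, critical at 10% free on /) suit the server, and that the hypothetical helper `ensure_nrpe_command` is acceptable as a way to edit the file.

```shell
# Hypothetical helper: append "command[<name>]=<command line>" to an NRPE
# config file unless a command with that name is already defined there.
ensure_nrpe_command() {
    cfg=$1 name=$2 cmdline=$3
    if ! grep -qF "command[$name]=" "$cfg"; then
        printf 'command[%s]=%s\n' "$name" "$cmdline" >> "$cfg"
    fi
}

# Possible usage on the monitored host (plugin path and thresholds assumed):
#   ensure_nrpe_command /usr/local/nrpe/etc/nrpe.cfg check_df \
#       "/usr/local/nrpe/libexec/check_disk -w 20% -c 10% -p /"
```

NRPE reads nrpe.cfg when it starts, so the daemon needs to be restarted after the file changes.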
More servers can be added to the hosts file in the same way.
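When many servers have to be added, the host block can be generated rather than typed. This is a sketch only: `make_host_block` is a hypothetical helper, and the directive values are copied from the host definitions in this article, so they may need adjusting for other environments.

```shell
# Hypothetical helper: print a host definition modelled on the host
# examples in this article. Arguments: host_name, alias, address.
make_host_block() {
    cat <<EOF
define host{
        use                     linux-server
        host_name               $1
        alias                   $2
        address                 $3
        contact_groups          sagroup
        check_command           check-host-alive
        max_check_attempts      3
        notification_interval   10
        notification_period     24x7
        notification_options    d,u,r
}
EOF
}

# Possible usage (host name and address are made up for illustration):
#   make_host_block web2 "web server 2" 192.168.1.12 >> /usr/local/nagios/etc/objects/hosts.cfg
```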
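Run the following command. If it reports no errors, the Nagios service can be reloaded.
- /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Reload the Nagios service so that the configuration takes effect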
- service nagios reload
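The verify-then-reload sequence can be combined so that a broken configuration is never loaded. A minimal sketch, assuming that `nagios -v` exits with a non-zero status when the check finds errors; `verify_and_reload` and the `NAGIOS_BIN`, `NAGIOS_CFG` and `RELOAD_CMD` variables are hypothetical names introduced here.

```shell
# NAGIOS_BIN, NAGIOS_CFG and RELOAD_CMD are assumed override hooks; the
# defaults are the paths and reload command used in this article.
verify_and_reload() {
    nagios_bin=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
    nagios_cfg=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
    if "$nagios_bin" -v "$nagios_cfg" >/dev/null 2>&1; then
        # Intentionally unquoted so the command and its arguments are split.
        ${RELOAD_CMD:-service nagios reload}
    else
        echo "configuration check failed; not reloading" >&2
        return 1
    fi
}

# Possible usage:
#   verify_and_reload
```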