Nagios graphical monitoring display and alert management [server side]
1. Use yum to install the base packages that PNP needs
Graphics dependency libraries:
yum install cairo pango zlib zlib-devel freetype freetype-devel gd gd-devel -y
2. Install the rrdtool dependency. All the software packages are in oldboy_training_nagios_soft.zip
cd ~/tools
tar xf libart_lgpl-2.3.17.tar.gz
cd libart_lgpl-2.3.17
./configure
make
make install
/bin/cp -r /usr/local/include/libart-2.0 /usr/include
cd ../
3. Install rrdtool, a round-robin database dedicated to drawing graphs
tar xf rrdtool-1.2.14.tar.gz
cd rrdtool-1.2.14
./configure --prefix=/usr/local/rrdtool --disable-python --disable-tcl
# WARNING: The RRDs Perl Modules are not found on your System
# Using RRDs will speedup things in larger Installtions.
# If the message above appears after configure, it can be ignored.
make
make install
cd ../
ls -l /usr/local/rrdtool/bin
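Beyond listing the bin directory, the new rrdtool binary can be exercised by creating a small round-robin database, writing one value, and reading the metadata back. The following is only a sketch: the function name, the `RRDTOOL` variable, the data source name `load`, and the sample value are illustrative assumptions, not part of the original notes. Setting `RRDTOOL=echo` prints the commands instead of running them.

```shell
#!/bin/sh
# Path of the rrdtool binary built above; set RRDTOOL=echo for a dry run.
RRDTOOL=${RRDTOOL:-/usr/local/rrdtool/bin/rrdtool}

rrd_smoke_test() {
    rrd_file=$1    # hypothetical scratch file, e.g. /tmp/test.rrd
    # One GAUGE data source sampled every 300 s (heartbeat 600 s),
    # keeping 288 averaged rows, i.e. one day of 5-minute samples.
    "$RRDTOOL" create "$rrd_file" --step 300 \
        DS:load:GAUGE:600:0:U \
        RRA:AVERAGE:0.5:1:288 || return 1
    # N stands for "now"; 0.42 is an arbitrary sample value.
    "$RRDTOOL" update "$rrd_file" N:0.42 || return 1
    # Print the data source definition and the last update time.
    "$RRDTOOL" info "$rrd_file"
}
```

A call such as `rrd_smoke_test /tmp/test.rrd` should end with the `info` listing if the build is usable.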
4. Install PNP. PNP collects the performance data and hands it to rrdtool to draw; once rrdtool has drawn the graphs, PNP displays them.
tar zxf pnp-0.4.14.tar.gz
cd pnp-0.4.14
./configure \
--with-rrdtool=/usr/local/rrdtool/bin/rrdtool \
--with-perfdata-dir=/usr/local/nagios/share/perfdata/
# --with-rrdtool is the command that actually renders the graphs;
# --with-perfdata-dir is the path of the data used for graphing.
#################
# WARNING: The RRDs Perl Modules are not found on your System
# Using RRDs will speedup things in larger Installtions.
#####################
make all
make install
make install-config
make install-init
ll /usr/local/nagios/libexec/ |grep process
For troubleshooting:
tar zxf pnp-0.4.14.tar.gz
cd pnp-0.4.14
./configure \
--with-rrdtool=/usr/local/rrdtool/bin/rrdtool --with-perfdata-dir=/usr/local/nagios/share/perfdata/
#################
# WARNING: The RRDs Perl Modules are not found on your System
# Using RRDs will speedup things in larger Installtions.
#####################
make all
make install
make install-config
make install-init
ll /usr/local/nagios/libexec/ |grep process
Problem:
configure fails with:
checking for linker flags for loadable modules... -shared
checking for Perl Module Time::HiRes... no
configure: error: Perl Module Time::HiRes not available
Solution (this usually does not happen):
yum install perl-Time-HiRes -y
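The missing-module error can be caught before running configure by asking perl to load the module directly. A minimal sketch; the function name `check_perl_modules` is made up for illustration, and it only checks that the module loads, not which version is installed.

```shell
#!/bin/sh
# Report whether each named Perl module can be loaded by the system perl.
# Returns 1 if at least one module is missing.
check_perl_modules() {
    missing=0
    for mod in "$@"; do
        if perl -M"$mod" -e 1 >/dev/null 2>&1; then
            echo "ok: $mod"
        else
            echo "missing: $mod"
            missing=1
        fi
    done
    return $missing
}
```

For example, `check_perl_modules Time::HiRes` reports whether the module that the PNP configure script looks for is loadable.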
5. Enable performance data processing in nagios.cfg
cd /usr/local/nagios/etc/
cp nagios.cfg nagios.cfg.ori
vi nagios.cfg +835
833 process_performance_data=1
834
835
836
837 # HOST AND SERVICE PERFORMANCE DATA PROCESSING COMMANDS
838 # These commands are run after every host and service check is
839 # performed. These commands are executed only if the
840 # enable_performance_data option (above) is set to 1. The command
841 # argument is the short name of a command definition that you
842 # define in your host configuration file. Read the HTML docs for
843 # more information on performance data.
844
845 host_perfdata_command=process-host-perfdata       <== uncomment
846 service_perfdata_command=process-service-perfdata <== uncomment
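After editing, it is easy to confirm that the three directives are active (not commented out) with a grep over the file. This is a sketch; the function name `perfdata_settings` is an illustrative assumption.

```shell
#!/bin/sh
# Print the uncommented performance-data directives from a Nagios main
# config file. Lines that still start with '#' are not matched.
perfdata_settings() {
    cfg=$1
    grep -E '^[[:space:]]*(process_performance_data|host_perfdata_command|service_perfdata_command)[[:space:]]*=' "$cfg"
}
```

Running `perfdata_settings /usr/local/nagios/etc/nagios.cfg` should print all three lines once the edit is complete; a missing line means the directive is still commented out.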
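Delete the earlier content:
remove the two existing command definitions, process-host-perfdata and process-service-perfdata (in a default install they are in objects/commands.cfg),
and add the following: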
# 'process-host-perfdata' command definition
define command{
command_name process-host-perfdata
command_line /usr/local/nagios/libexec/process_perfdata.pl
}
# 'process-service-perfdata' command definition
define command{
command_name process-service-perfdata
command_line /usr/local/nagios/libexec/process_perfdata.pl
}
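PNP can only graph checks whose plugin output carries performance data, that is, the text after the `|` character in the form `label=value[UOM];warn;crit;min;max`. The sketch below shows how that part is separated from the human-readable status text. The function names are illustrative assumptions, and the splitter does not handle quoted labels that contain spaces.

```shell
#!/bin/sh
# Print the performance-data part of a plugin output line (the text
# after the first '|'), or nothing if the plugin emitted no perfdata.
perfdata_of() {
    printf '%s\n' "$1" | cut -s -d'|' -f2- | sed 's/^[[:space:]]*//'
}

# Print "label value" for every label=value[UOM];... item in a perfdata
# string. The value keeps its unit suffix, e.g. "40%".
perfdata_items() {
    for item in $1; do
        label=${item%%=*}     # text before the first '='
        rest=${item#*=}       # text after the first '='
        value=${rest%%;*}     # text before the first ';'
        printf '%s %s\n' "$label" "$value"
    done
}
```

A plugin whose output has nothing after `|` gives PNP nothing to draw, which is the first thing to check when a custom plugin shows no graph.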
Check the syntax and restart the services:
/etc/init.d/nagios checkconfig
/etc/init.d/nagios reload
/etc/init.d/httpd start
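When the syntax check fails, running the nagios binary with `-v` prints the full error list. The sketch below ties the reload to that check. The binary and config paths are the usual locations for a source install under /usr/local/nagios and are assumptions here, as is the function name.

```shell
#!/bin/sh
# Assumed default locations; override the variables if yours differ.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
NAGIOS_INIT=${NAGIOS_INIT:-/etc/init.d/nagios}

safe_reload() {
    # -v verifies the configuration and prints any errors in full.
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG"; then
        "$NAGIOS_INIT" reload
    else
        echo "configuration check failed; not reloading" >&2
        return 1
    fi
}
```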
Visit:
http://10.0.0.11/nagios/pnp/index.php
Host graphs are configured in hosts.cfg:
vi hosts.cfg
define host{
    use          linux-server
    host_name    204-zhuangjiajun
    alias        204-zhuangjiajun
    address      10.0.0.204
    action_url   /nagios/pnp/index.php?host=$HOSTNAME$
}
The action_url can also be added to the template instead:
vi templates.cfg
Search with /linux-server
define host{
    name                    linux-server      ; The name of this host template
    check_period            24x7              ; By default, Linux hosts are checked round the clock
    check_interval          5                 ; Actively check the host every 5 minutes
    retry_interval          1                 ; Schedule host check retries at 1 minute intervals
    max_check_attempts      10                ; Check each Linux host 10 times (max)
    check_command           check-host-alive  ; Default command to check Linux hosts
    notification_period     workhours         ; Linux admins hate to be woken up, so we only notify during the day
                                              ; Note that the notification_period variable is being overridden from
                                              ; the value that is inherited from the generic-host template!
    notification_interval   120               ; Resend notifications every 2 hours
    notification_options    d,u,r             ; Only send notifications for specific host states
    contact_groups          admins            ; Notifications get sent to the admins by default
    register                0                 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
    action_url              /nagios/pnp/index.php?host=$HOSTNAME$
}
Result: [screenshot: host graph in PNP]
Service graphs
vi services.cfg
define service{
    use                   generic-service
    host_name             204-zhuangjiajun
    service_description   memory
    check_command         check_nrpe!check_memory.pl
    action_url            /nagios/pnp/index.php?host=$HOSTNAME$&srv=$SERVICEDESC$
}
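Nagios replaces the `$HOSTNAME$` and `$SERVICEDESC$` macros when it renders the action link, so each host or service gets its own PNP URL. The sketch below mimics that substitution to show which link to expect. The function name is an illustrative assumption, and no URL-encoding is done, so it suits simple names without spaces.

```shell
#!/bin/sh
# Print the PNP link for a host (one argument) or for a service on
# that host (two arguments), following the action_url patterns above.
pnp_action_url() {
    host=$1
    if [ $# -ge 2 ]; then
        printf '/nagios/pnp/index.php?host=%s&srv=%s\n' "$host" "$2"
    else
        printf '/nagios/pnp/index.php?host=%s\n' "$host"
    fi
}
```

For the service defined above, `pnp_action_url 204-zhuangjiajun memory` prints the path that the action icon in the Nagios web interface should point to.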
This can likewise be configured in the template:
vi templates.cfg
Search with /service
define service{
    name                           generic-service  ; The 'name' of this service template
    active_checks_enabled          1                ; Active service checks are enabled
    passive_checks_enabled         1                ; Passive service checks are enabled/accepted
    parallelize_check              1                ; Active service checks should be parallelized (disabling this can lead to major performance problems)
    obsess_over_service            1                ; We should obsess over this service (if necessary)
    check_freshness                0                ; Default is to NOT check service 'freshness'
    notifications_enabled          1                ; Service notifications are enabled
    event_handler_enabled          1                ; Service event handler is enabled
    flap_detection_enabled         1                ; Flap detection is enabled
    failure_prediction_enabled     1                ; Failure prediction is enabled
    process_perf_data              1                ; Process performance data
    retain_status_information      1                ; Retain status information across program restarts
    retain_nonstatus_information   1                ; Retain non-status information across program restarts
    is_volatile                    0                ; The service is not volatile
    check_period                   24x7             ; The service can be checked at any time of the day
    max_check_attempts             3                ; Re-check the service up to 3 times in order to determine its final (hard) state
    normal_check_interval          10               ; Check the service every 10 minutes under normal conditions
    retry_check_interval           2                ; Re-check the service every two minutes until a hard state can be determined
    contact_groups                 admins           ; Notifications get sent out to everyone in the 'admins' group
    notification_options           w,u,c,r          ; Send notifications about warning, unknown, critical, and recovery events
    notification_interval          60               ; Re-notify about service problems every hour
    notification_period            24x7             ; Notifications can be sent out at any time
    register                       0                ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
    action_url                     /nagios/pnp/index.php?host=$HOSTNAME$&srv=$SERVICEDESC$
}
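Result: [screenshot: service graph in PNP]
Nagios alerting
Email alerts
Email-to-SMS forwarding
SMS gateway =========> recommended by Oldboy (老男孩)
If there is an on-duty dashboard, a person on shift makes the phone call
Bind WeChat to the mailbox
Use email alerts for non-urgent issues, and email + SMS for important, urgent ones
Oldboy's philosophy:
Spend a reasonable amount of money to run the service as well as possible; if an alert fails to get out, the loss is far greater
Key alerting principle:
Everything that should be reported is reported, and nothing that should not be reported gets out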
Steps to configure alerting
1. Write the SMS alert script (the SMS gateway is a paid service)
[root@oldboy-A libexec]# pwd
/usr/local/nagios/libexec
[root@oldboy-A libexec]# cat sms_send
#!/bin/sh
PROGNAME=`basename $0`
PROGPATH=`echo $0 | sed -e 's,[\\/][^\\/][^\\/]*$,,'`
print_usage() {
echo "Usage:"
echo "/bin/sh $PROGNAME title contact"
exit 1
}
if [ $# -ne 2 ];then
print_usage
fi
alert_date=$(date +%y-%m-%d" "%H:%M)
TITLE=$1 #FORMAT "Host $HOSTSTATE$ alert for $HOSTNAME$"
CONTACT=$2
# curl method
curl -d cdkey=3RTY-EMY-0980-MTUQ2 -d password=189162 -d phone="$CONTACT" -d message="$TITLE[${alert_date} oldboysa]"
#wget --quiet "http://s.ccme.cc/qxt/send.jsp?circle=159net_131&pwd=oldboy123&mobile=18911718229&service=f1fb0546-ebb6-0987-8f20-560524c1f88d&msgid=3956724&message=$TITLE[${alert_date} oldboysa n]"
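2. Add contacts and contact groups in contacts.cfg
3. Add the alert commands in commands.cfg
4. Adjust the contact template to use the alert commands (the ones defined in commands.cfg); append each new command after a comma
5. In hosts.cfg and services.cfg, add the alert contacts and groups, or add them to the corresponding templates:
contact_groups admin,sa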
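Steps 2 to 4 can be sketched as the following Nagios object definitions. This is a hedged illustration, not the configuration used in the class: the contact name, email address, pager number, and the two `notify-*-by-sms` command names are hypothetical, while `generic-contact`, the `notify-*-by-email` commands, and the `$USER1$`, `$CONTACTPAGER$`, `$HOSTSTATE$`, `$SERVICESTATE$`, `$HOSTNAME$`, and `$SERVICEDESC$` macros come from a stock Nagios install. `$USER1$` is assumed to point at /usr/local/nagios/libexec, where sms_send was placed.

```
# contacts.cfg (sketch): a contact notified by both email and SMS
define contact{
    contact_name                    sa-example        ; hypothetical contact
    use                             generic-contact
    alias                           Example SA
    email                           sa@example.com    ; placeholder address
    pager                           13800000000       ; placeholder mobile number
    service_notification_commands   notify-service-by-email,notify-service-by-sms
    host_notification_commands      notify-host-by-email,notify-host-by-sms
}

define contactgroup{
    contactgroup_name   sa
    alias               System Administrators
    members             sa-example
}

# commands.cfg (sketch): call sms_send with a title and a phone number
define command{
    command_name   notify-host-by-sms
    command_line   $USER1$/sms_send "Host $HOSTSTATE$ alert for $HOSTNAME$" $CONTACTPAGER$
}

define command{
    command_name   notify-service-by-sms
    command_line   $USER1$/sms_send "Service $SERVICESTATE$ alert for $HOSTNAME$/$SERVICEDESC$" $CONTACTPAGER$
}
```

The two arguments match the `title contact` usage of the sms_send script, and the comma-separated command lists are what step 4 refers to.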
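This week's homework:
1. Monitor RAID and CPU temperature
2. Produce graphs from a custom plugin
3. Complete a Cacti deployment and produce graphs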