Archive

Posts tagged 'nagios'

nrpe Error: Could not complete SSL handshake. 5

November 14, 2013

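/var/log/messages was being flooded with entries like this:
Nov 14 21:18:57 2hei.net nrpe[14855]: Error: Could not complete SSL handshake. 5

I tried every suggestion I could find online: allowed_hosts in nrpe.conf, the nrpe version, and so on. All of it was misleading, and none of it worked!

Then I noticed the check_tcp self-check against the nrpe port. check_tcp opens a plain TCP connection to port 5666 and closes it without any SSL handshake, so nrpe logs this error on every check. Removing that check fixed it.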

Final solution:
Edit the Nagios configuration file and remove this service definition:
define service{
use oupeng
host_name 2hei.net
service_description check_tcp_5666
check_command check_tcp!5666
}
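
To illustrate why such a check produces the log entry: a TCP port check roughly amounts to opening a plain connection and closing it again, with no SSL handshake in between. The sketch below is only an illustration of that behaviour, not check_tcp's actual code; the function name tcp_port_open is made up for this example.

```python
import socket

def tcp_port_open(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        # Connect and close immediately; no SSL/TLS handshake is attempted,
        # which is what an SSL-enabled daemon would complain about.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unresolvable host.
        return False
```

Calling tcp_port_open("2hei.net", 5666) against an SSL-enabled nrpe would be expected to succeed as a port check while leaving one handshake error in the remote log per call.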

Category: OpenSource

nagios check_udp is basically crippled

November 3, 2013

In Nagios, check_udp is just a symlink to check_tcp:
lrwxrwxrwx 1 root root 9 Sep 23 22:01 check_udp -> check_tcp

Two examples:
./check_udp -H 2hei.net -p 161 -s "" -e "" -w 2 -c 5
No data received from host

./check_udp -H 2hei.net -p 123 -s "" -e "" -w 2 -c 5
CRITICAL - Socket timeout after 10 seconds

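/usr/lib64/nagios/plugins/check_udp -h
check_udp: No arguments found
Usage:
check_udp -H host -p port [-w <warning time>] [-c <critical time>] [-s <send string>]
[-e <expect string>] [-q <quit string>][-m <maximum bytes>] [-d <delay>]
[-t <timeout seconds>] [-r <refuse state>] [-M <mismatch state>] [-v] [-4|-6] [-j]
[-D <warn days cert expire>[,<crit days cert expire>]] [-S <use SSL>] [-E]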

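It insists on a send string (-s) and an expect string (-e), but many UDP ports simply never send anything back. Ridiculous.
Looks like I will have to roll my own.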
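
As a starting point for a home-grown check, here is a minimal sketch of a UDP probe in Python. It is an assumption about how such a check could look, not the plugin the post goes on to write; the name udp_probe and its parameters are hypothetical. It sends one datagram and waits for a reply, and it shows the core difficulty: with UDP, silence does not distinguish a closed port from a service that just does not answer.

```python
import socket

def udp_probe(host, port, payload=b"", timeout=2.0):
    """Send one datagram; return the reply bytes, or None if nothing came back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(payload, (host, port))
            data, _addr = sock.recvfrom(4096)
            return data
        except OSError:
            # Timeout or network error: the port may be closed, filtered,
            # or the service may simply not reply to this payload.
            return None
```

A real check would need a protocol-specific payload (for example a valid SNMP or NTP request) to get a meaningful answer from ports like 161 or 123.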

Category: OpenSource

nagios plugins check_proc_runtime

June 3, 2013

A Nagios plugin that checks how long a specified process has been running and alerts when it has been running for too long.

code:
check_proc_runtime

usage on nrpe:

$ cat nrpe.conf
command[check_proc_rsync]=/usr/local/nagios/libexec/check_proc_runtime -k rsync -e inotify -c 360
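
The plugin's code is not reproduced on this page, so the following is only a sketch of the idea under stated assumptions: that -k is a keyword to match in the command line, -e a string to exclude, and -c a critical threshold (treated as seconds here, which the post does not confirm). The function name find_long_running is hypothetical. Its input is lines of "elapsed-seconds command", such as what `ps -eo etimes=,args=` prints on systems whose ps supports the etimes field.

```python
def find_long_running(ps_lines, keyword, exclude, critical_seconds):
    """Return (elapsed_seconds, command) for matching processes over the limit."""
    offenders = []
    for line in ps_lines:
        parts = line.strip().split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip headers and malformed lines
        elapsed, command = int(parts[0]), parts[1]
        if keyword not in command:
            continue
        if exclude and exclude in command:
            continue
        if elapsed > critical_seconds:
            offenders.append((elapsed, command))
    return offenders
```

A plugin built on this would print a CRITICAL line and exit with 2 when the returned list is non-empty, and print an OK line and exit with 0 otherwise.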

Category: python

write nagios nrpe plugin

July 1, 2010 (1 comment)

Scripts and executables must do two things (at a minimum) in order to function as Nagios plugins:
1. Exit with one of several possible return values
2. Return at least one line of text output to STDOUT

Plugin Return Code   Service State   Host State
0                    OK              UP
1                    WARNING         UP or DOWN/UNREACHABLE*
2                    CRITICAL        DOWN/UNREACHABLE
3                    UNKNOWN         DOWN/UNREACHABLE
* Note: If the use_aggressive_host_checking option is enabled, return codes of 1 will result in a host
state of DOWN or UNREACHABLE. Otherwise return codes of 1 will result in a host state of UP.
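
As a small illustration of the return codes in the table, the sketch below maps a measured value to a plugin return code. The names classify and STATE_NAMES are made up for this example, and it assumes a metric where higher values are worse; a "lower is worse" metric would flip the comparisons.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}

def classify(value, warning, critical):
    """Map a measurement to a plugin return code (higher values are worse)."""
    if value >= critical:
        return CRITICAL
    if value >= warning:
        return WARNING
    return OK
```

A plugin would print one line containing STATE_NAMES[code] and then call sys.exit(code).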

Plugin Output Spec
At a minimum, plugins should return at least one line of text output. Beginning with Nagios 3, plugins can
optionally return multiple lines of output. Plugins may also return optional performance data that can
be processed by external applications. The basic format for plugin output is shown below:
TEXT OUTPUT | OPTIONAL PERFDATA
LONG TEXT LINE 1
LONG TEXT LINE 2
...
LONG TEXT LINE N | PERFDATA LINE 2
PERFDATA LINE 3
...
PERFDATA LINE N
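
A minimal sketch of assembling output in this format, covering only the first line, its optional perfdata, and optional long text lines. The helper name format_plugin_output is hypothetical, and the extra perfdata lines after the last long text line are left out for brevity.

```python
def format_plugin_output(text, perfdata="", long_text=()):
    """Build 'TEXT | PERFDATA' followed by optional long text lines."""
    first_line = text
    if perfdata:
        # Perfdata is separated from the text by a pipe character.
        first_line += " | " + perfdata
    return "\n".join([first_line] + list(long_text))
```

For example, format_plugin_output("DISK OK", "/=2643MB", ["/ 15%", "/boot 8%"]) yields a three-line string whose first line is "DISK OK | /=2643MB".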

This is my Python script:
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import sys, getopt
import memcache

memcached_host = '2hei.net'
memcached_port = 11211
# Default thresholds: alert when curr_items drops to or below these values.
Warning_item = 120
Critical_item = 20

def usage():
    print("""
Usage: check_memcached [-h|--help] [-w|--warning curr_items] [-c|--critical curr_items]
Warning curr_items defaults to 120
Critical curr_items defaults to 20
""")
    sys.exit(3)

# Get curr_items from the memcached stats of the first server.
def get_memcache_curr_items(mc):
    # get_stats() returns a list of (server, stats_dict) tuples.
    stats = mc.get_stats()[0][1]
    items = stats.get('curr_items')
    return items

if __name__ == "__main__":
    warning_item = Warning_item
    critical_item = Critical_item

    try:
        # "h" takes no argument; long options are passed as a list.
        options, args = getopt.getopt(sys.argv[1:], "hw:c:", ["help", "warning=", "critical="])
    except getopt.GetoptError:
        usage()

    # Parse options before connecting, so that -h works without a server.
    for name, value in options:
        if name in ("-h", "--help"):
            usage()
        if name in ("-w", "--warning"):
            warning_item = value
        if name in ("-c", "--critical"):
            critical_item = value

    try:
        mc = memcache.Client([memcached_host + ':' + str(memcached_port)], debug=0)
        items = int(get_memcache_curr_items(mc))
        mc.disconnect_all()
    except Exception as e:
        # Always print something: NRPE reports an error when a plugin prints nothing.
        print("Cannot get memcache's curr_items. %s" % e)
        sys.exit(3)

    if items <= int(critical_item):
        print('MEMCACHED_ITEM CRITICAL: curr_items is: %s' % items)
        sys.exit(2)
    if items <= int(warning_item):
        print('MEMCACHED_ITEM WARNING: curr_items is: %s' % items)
        sys.exit(1)
    else:
        print('MEMCACHED_ITEM OK: curr_items is: %s' % items)
        sys.exit(0)

If you encounter one of these errors:
CHECK_NRPE: No output returned from daemon.
or
CHECK_NRPE: Received 0 bytes from daemon.  Check the remote server logs for error messages.
it means your plugin printed nothing to STDOUT.
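
One way to guard against empty output is to wrap the check so that every code path yields a message, including failures. This is a sketch of that idea only; run_check is a hypothetical helper, and it assumes the check is a function returning an (exit_code, message) pair.

```python
def run_check(check):
    """Run check() and return (exit_code, message); the message is never empty."""
    try:
        code, message = check()
    except Exception as exc:
        # Report failures as UNKNOWN (3) instead of dying silently.
        return 3, "UNKNOWN: check failed: %s" % exc
    if not message:
        return 3, "UNKNOWN: check produced no output"
    return code, message

# In a plugin's entry point one would then do:
#   code, message = run_check(my_check)
#   print(message)
#   sys.exit(code)
```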

Category: OpenSource, python

Using nagios check_ping

September 17, 2008

Some of the Nagios helper tools are very useful, such as check_ping and check_tcp. Here is how to use check_ping.

The nagios check_ping command:
The source is in the nagios plugins: nagios-plugins-1.4.12/plugins/check_ping.c
 
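Usage:
./check_ping 
Usage: check_ping -H <host_address> -w <wrta>,<wpl>% -c <crta>,<cpl>%
                     [-p packets] [-t timeout] [-L] [-4|-6]
In detail:
-H    host address
-w    WARNING threshold: round-trip time (ms), packet loss (%)
-c    CRITICAL threshold: round-trip time (ms), packet loss (%)
-p    number of packets to send (default: 5)
-t    timeout (default: 10 seconds)
-4|-6 use IPv4 or IPv6 addresses (default: IPv4)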
 
For example:
1. OK:
./check_ping -H www.google.com -w 100.0,20% -c 200.0,50% -p 3 -t 2
PING OK - Packet loss = 0%, RTA = 1.49 ms
Exit status (echo $?): 0
2. WARNING:
./check_ping -H www.google.com -w 0.1,20% -c 200.0,50% -p 3 -t 2
PING WARNING - Packet loss = 0%, RTA = 1.71 ms
Exit status (echo $?): 1
3. CRITICAL:
./check_ping -H www.google.com -w 0.1,20% -c 0.9,50% -p 3 -t 2
PING CRITICAL - Packet loss = 0%, RTA = 1.60 ms
Exit status (echo $?): 2
 
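The output has the form: state, packet loss, ping round-trip time (RTA).
Because check_ping's output and exit status are so clear-cut,
other programs can call check_ping as an auxiliary network check tool.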
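
A sketch of calling check_ping from another program, in Python. The function names and the ./check_ping path are assumptions for illustration; the thresholds are the ones from the first example, and the regular expression is written against the output lines shown in the examples.

```python
import re
import subprocess

# Matches e.g. "PING OK - Packet loss = 0%, RTA = 1.49 ms"
# (also tolerates a typographic dash in place of the hyphen).
PING_RE = re.compile(r"PING (\w+) [-\u2013] Packet loss = (\d+)%, RTA = ([\d.]+) ms")

def parse_check_ping(output):
    """Return state, packet loss and RTA from a check_ping line, or None."""
    match = PING_RE.search(output)
    if match is None:
        return None
    return {
        "state": match.group(1),
        "loss_percent": int(match.group(2)),
        "rta_ms": float(match.group(3)),
    }

def run_check_ping(host, plugin="./check_ping"):
    """Run check_ping and return (exit_status, parsed_output)."""
    proc = subprocess.run(
        [plugin, "-H", host, "-w", "100.0,20%", "-c", "200.0,50%", "-p", "3", "-t", "2"],
        stdout=subprocess.PIPE, universal_newlines=True)
    return proc.returncode, parse_check_ping(proc.stdout)
```

The exit status alone is enough for a simple up/down decision; parsing the text is only needed when the caller wants the actual loss and RTA figures.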
Category: OpenSource

learn shell check_tcpconn.sh from nagios

August 25, 2008

#!/bin/sh
# check_tcpconn.sh
# warning value
W=1500
# critical value
C=2190
if [ -f /proc/net/tcp6 ]
        then
                TCP_FILE6="/proc/net/tcp6"
fi
if [ -f /proc/net/tcp ]
        then
                TCP_FILE="/proc/net/tcp"
fi
cat $TCP_FILE6 $TCP_FILE > /tmp/tcpstat
# Field 4 (st) is the connection state in hex: 01 ESTABLISHED, 03 SYN_RECV, 06 TIME_WAIT.
awk -v TOTAL_W="$W" -v TOTAL_C="$C" 'BEGIN{ ESTABLISHED=TIME_WAIT=SYN_RECV=TOTAL=0}
                # skip the header line of each file so it is not counted as a connection
                $1 == "sl" {next}
                {if($4 == "01") {ESTABLISHED++ ; TOTAL++} else if($4 == "06") {TIME_WAIT++; TOTAL++} else if($4 == "03") {SYN_RECV++; TOTAL++} else TOTAL++ }
                END{
                # exit with the Nagios return code: 0 OK, 1 WARNING, 2 CRITICAL
                if (TOTAL < TOTAL_W) {STATE="OK"; CODE=0}
                else if (TOTAL < TOTAL_C) {STATE="WARNING"; CODE=1}
                else {STATE="CRITICAL"; CODE=2}
                printf "%s CONN %s ESTABLISHED  %s TIME_WAIT  %s SYN_RECV  %s  TOTAL|CONN,%s,%s,%s,%s;\n",STATE,ESTABLISHED,TIME_WAIT,SYN_RECV,TOTAL,ESTABLISHED,TIME_WAIT,SYN_RECV,TOTAL
                exit CODE
                }' /tmp/tcpstat

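This script reads /proc/net/tcp, where you can find the state of every TCP connection.

In the output of cat /proc/net/tcp, the st column holds the state number in hex (01 = ESTABLISHED, 03 = SYN_RECV, 06 = TIME_WAIT, ...). The kernel's corresponding flags are 1 << state: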
enum {
  TCPF_ESTABLISHED = (1 << 1),
  TCPF_SYN_SENT  = (1 << 2),
  TCPF_SYN_RECV  = (1 << 3),
  TCPF_FIN_WAIT1 = (1 << 4),
  TCPF_FIN_WAIT2 = (1 << 5),
  TCPF_TIME_WAIT = (1 << 6),
  TCPF_CLOSE     = (1 << 7),
  TCPF_CLOSE_WAIT = (1 << 8),
  TCPF_LAST_ACK  = (1 << 9),
  TCPF_LISTEN    = (1 << 10), /* st = 0A */
  TCPF_CLOSING   = (1 << 11), /* st = 0B */
};
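
For comparison, the same counting can be sketched in Python. The function name count_tcp_states is made up for this example; the state numbers follow the order of the enum above (bit position = state number, shown in hex in the st column).

```python
from collections import Counter

# st column value (hex) -> state name, following the enum order above.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV", "04": "FIN_WAIT1",
    "05": "FIN_WAIT2", "06": "TIME_WAIT", "07": "CLOSE", "08": "CLOSE_WAIT",
    "09": "LAST_ACK", "0A": "LISTEN", "0B": "CLOSING",
}

def count_tcp_states(lines):
    """Count connections per state from lines of /proc/net/tcp or tcp6."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 4 or fields[0] == "sl":
            continue  # skip the header and malformed lines
        counts[TCP_STATES.get(fields[3].upper(), "UNKNOWN")] += 1
    return counts
```

On a Linux host it could be fed with open("/proc/net/tcp").readlines(); sum(counts.values()) then corresponds to the TOTAL computed by the shell script.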

Category: OpenSource