I have 3 DB2 servers, all have the same processes (only the DB name are different), they all have the same process check (based on a service template).The checks worked well for several months, from a few days the check is CRITICAL on 1 server whereas the process is still present. I can’t find why. Here is the check command from the poller :/usr/lib64/nagios/plugins/check_centreon_nrpe3 -H XXX0002.domain -p xxxx -t30 -u -c check_centreon_plugins -a ‘os::linux::local::plugin’ ‘process’ ‘--warning-total= --critical-total=1:1 --warning-time= --critical-time= --filter-command="^db2sysc$" --filter-arg="0" --filter-state="" ‘For XXX0002 it returns :CRITICAL: Number of current processes: 0 | 'processes.total.count'=0;;1:1;0;For XXX0001 and XXX0003 it returns :OK: Process: [command => db2sysc] [arg => db2sysc 0] [state => S] duration: 1y 7M 12h 30m 31s - Number of current processes: 1 | 'processes.total.count'=1;;1:1;0; When I execute the bash command used by the check, i.e.ps -e -o state -o ===%t===%p===%P=== -o comm:50 -o ===%a -w 2>&1then grep “db2sysc”, I can see (on all servers) the DB2 process, including the one I want to monitor :S===578-19:49:21===2471=== 1===db2syscr ===db2wdog 0 [db2wol]S===578-19:49:21===2476=== 2471===db2sysc ===db2sysc 0S===578-19:49:21===2482=== 2471===db2syscr ===db2ckpwd 0S===578-19:49:21===2483=== 2471===db2syscr ===db2ckpwd 0S===578-19:49:21===2484=== 2471===db2syscr ===db2ckpwd 0it’s all the same on the 3 servers, the DB are running perfectly.Why 1 check of 3 could have gone wrong ? 😓 Centreon 24.04Check from plugin “OS Linux”, mode process. Thx for help by advance

check OS Linux Process : false "critical"

I have 3 DB2 servers, all have the same processes (only the DB name are different), they all have the same process check (based on a service template).

The checks worked well for several months, from a few days the check is CRITICAL on 1 server whereas the process is still present. I can’t find why.

Here is the check command from the poller :

/usr/lib64/nagios/plugins/check_centreon_nrpe3 -H XXX0002.domain -p xxxx -t30 -u -c check_centreon_plugins -a ‘os::linux::local::plugin’ ‘process’ ‘--warning-total= --critical-total=1:1 --warning-time= --critical-time= --filter-command="^db2sysc$" --filter-arg="0" --filter-state="" ‘

For XXX0002 it returns :

CRITICAL: Number of current processes: 0 | 'processes.total.count'=0;;1:1;0;

For XXX0001 and XXX0003 it returns :

OK: Process: ecommand => db2sysc] =arg => db2sysc 0] gstate => S] duration: 1y 7M 12h 30m 31s - Number of current processes: 1 | 'processes.total.count'=1;;1:1;0;

When I execute the bash command used by the check, i.e.

ps -e -o state -o ===%t===%p===%P=== -o comm:50 -o ===%a -w 2>&1

then grep “db2sysc”, I can see (on all servers) the DB2 process, including the one I want to monitor :

S===578-19:49:21===2471===    1===db2syscr                                ===db2wdog 0 �db2wol]
S===578-19:49:21===2476=== 2471===db2sysc                                ===db2sysc 0
S===578-19:49:21===2482=== 2471===db2syscr                                ===db2ckpwd 0
S===578-19:49:21===2483=== 2471===db2syscr                                ===db2ckpwd 0
S===578-19:49:21===2484=== 2471===db2syscr                                ===db2ckpwd 0

it’s all the same on the 3 servers, the DB are running perfectly.

Why 1 check of 3 could have gone wrong ?

Centreon 24.04

Check from plugin “OS Linux”, mode process.

Thx for help by advance

Be the first to reply!

Reply

Sign up

Login to the community

Scanning file for viruses.

This file cannot be downloaded