Hi !
Can you get the executed command line from the details panel of the service in "Monitoring > Resource Status"?
Then run it as the centreon-engine user: su - centreon-engine -c "<command>"
You should reproduce the issue.
This is indeed what is happening :O
12:48:20 root@centreon-21:/var/lib/centreon/centplugins# /usr/lib/centreon/plugins/centreon_linux_snmp.pl --plugin=os::linux::snmp::plugin --mode=interfaces --hostname=10.0.0.xxxxx --snmp-version='2c' --snmp-community='xxxxx' --interface='.*' --name --add-status --add-errors --critical-status='' --warning-in-discard='' --critical-in-discard='' --warning-out-discard='' --critical-out-discard='' --warning-in-error='' --critical-in-error='' --warning-out-error='' --critical-out-error='' --verbose
OK: All interfaces are ok
12:48:37 root@centreon-21:/var/lib/centreon/centplugins# su - centreon-engine -c "/usr/lib/centreon/plugins/centreon_linux_snmp.pl --plugin=os::linux::snmp::plugin --mode=interfaces --hostname=10.0.0.xxxxx --snmp-version='2c' --snmp-community='xxxxx' --interface='.*' --name --add-status --add-errors --critical-status='' --warning-in-discard='' --critical-in-discard='' --warning-out-discard='' --critical-out-discard='' --warning-in-error='' --critical-in-error='' --warning-out-error='' --critical-out-error='' --verbose"
UNKNOWN: No entry found (maybe you should reload cache file)
But not for all checks. The traffic check returns OK in both cases:
12:44:25 root@centreon-21:/var/lib/centreon/centplugins# /usr/lib/centreon/plugins/centreon_linux_snmp.pl --plugin=os::linux::snmp::plugin --mode=interfaces --hostname=10.0.0.xxxxx --snmp-version='2c' --snmp-community='xxxxx' --interface='.*' --name --add-status --add-traffic --critical-status='' --warning-in-traffic='80' --critical-in-traffic='90' --warning-out-traffic='80' --critical-out-traffic='90' --verbose
OK: All interfaces are ok | 'traffic_in_lo'=6138.67b/s;0:8000000;0:9000000;0;10000000 'traffic_out_lo'=6138.67b/s;0:8000000;0:9000000;0;10000000 'traffic_in_ens160'=5099.78b/s;0:8000000000;0:9000000000;0;10000000000 'traffic_out_ens160'=2858.67b/s;0:8000000000;0:9000000000;0;10000000000
Interface 'lo' Status : up (admin: up), Traffic In : 6.14Kb/s (0.06%), Traffic Out : 6.14Kb/s (0.06%)
Interface 'ens160' Status : up (admin: up), Traffic In : 5.10Kb/s (0.00%), Traffic Out : 2.86Kb/s (0.00%)
12:44:34 root@centreon-21:/var/lib/centreon/centplugins# su - centreon-engine -c "/usr/lib/centreon/plugins/centreon_linux_snmp.pl --plugin=os::linux::snmp::plugin --mode=interfaces --hostname=10.0.0.xxxxx --snmp-version='2c' --snmp-community='xxxxx' --interface='.*' --name --add-status --add-traffic --critical-status='' --warning-in-traffic='80' --critical-in-traffic='90' --warning-out-traffic='80' --critical-out-traffic='90' --verbose"
OK: All interfaces are ok | 'traffic_in_lo'=5088.00b/s;0:8000000;0:9000000;0;10000000 'traffic_out_lo'=5088.00b/s;0:8000000;0:9000000;0;10000000 'traffic_in_ens160'=7000.89b/s;0:8000000000;0:9000000000;0;10000000000 'traffic_out_ens160'=2943.11b/s;0:8000000000;0:9000000000;0;10000000000
Interface 'lo' Status : up (admin: up), Traffic In : 5.09Kb/s (0.05%), Traffic Out : 5.09Kb/s (0.05%)
Interface 'ens160' Status : up (admin: up), Traffic In : 7.00Kb/s (0.00%), Traffic Out : 2.94Kb/s (0.00%)
I made some progress on the problem. The cache files are indeed the cause of all my trouble: services go to UNKNOWN status and I cannot recheck a service from the web interface.
If I run rm -rf /var/lib/centreon/centplugins/*, all my services return to OK status and rechecks work again.
But as soon as the cache files are recreated, the problem comes back.
So by running a script that deletes the files in /var/lib/centreon/centplugins in a loop, I got all my services back to OK status.
That is not a clean fix at all, though. Can Centreon be prevented from writing cache files to this folder?
The source of the problem was a lack of free disk space:
df -h
/dev/mapper/centos_centreon-var_lib_centreon 6,8G 6,4G 0 100% /var/lib/centreon
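The df check above can be wrapped in a small helper that warns before the cache filesystem fills up. This is only a sketch: the function names and the 90% threshold are hypothetical choices, not something Centreon ships.

```shell
#!/bin/sh
# Print the usage percentage (without the % sign) of the filesystem
# that holds the given path. df -P gives one stable line per filesystem.
usage_pct() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Report whether the filesystem holding a directory is at or above a limit.
# The default limit of 90 is an arbitrary assumption.
check_cache_space() {
    dir=$1
    limit=${2:-90}
    pct=$(usage_pct "$dir")
    if [ "$pct" -ge "$limit" ]; then
        echo "WARNING: filesystem for $dir is ${pct}% full"
        return 1
    fi
    echo "OK: filesystem for $dir is ${pct}% full"
}

# Example call for the plugin cache directory from this thread:
#   check_cache_space /var/lib/centreon/centplugins 90
```

Running the check as the centreon-engine user rather than root keeps it close to the conditions under which the plugins write their cache files.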
The problem is NOT solved yet!
Please do not close the ticket!
Hi
You can specify another cache directory with the option --statefile-dir=<new_dir>.
If you have a lot of services with metrics, check that /var/lib/centreon/metrics is not full.
If it is, you will probably need to extend the disk.
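As a rough illustration of the --statefile-dir suggestion above, the new directory has to exist and be writable by the user that runs the checks. The helper names and the example path /var/cache/centreon-plugins-state are hypothetical; only the --statefile-dir option itself comes from this thread.

```shell
#!/bin/sh
# Create the alternative cache directory and give it to the check user.
# The owner defaults to centreon-engine, the user seen earlier in this thread.
prepare_cache_dir() {
    dir=$1
    owner=${2:-centreon-engine}
    mkdir -p "$dir" && chown "$owner" "$dir"
}

# Append --statefile-dir to an existing plugin command line.
with_statefile_dir() {
    printf "%s --statefile-dir='%s'\n" "$1" "$2"
}

# Example (hypothetical path, run as root):
#   prepare_cache_dir /var/cache/centreon-plugins-state
#   with_statefile_dir "<command>" /var/cache/centreon-plugins-state
```

The option would then need to be added to the check commands in the Centreon configuration so that the engine uses it on every run, not only on manual tests.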
My Centreon server is virtualized and its disks are configured with LVM, so I followed this tutorial to extend the partition /dev/sda2 and then the /var/lib/centreon mount: Link
11:38:03 root@centreon-21:~# lsblk
NAME                                                   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                                                       8:0   0   70G  0 disk
├─sda1                                                    8:1   0 1000M  0 part /boot
└─sda2                                                    8:2   0   69G  0 part
  ├─centos_centreon--central-root                       253:0   0   20G  0 lvm  /
  ├─centos_centreon--central-swap                       253:1   0    4G  0 lvm  [SWAP]
  ├─centos_centreon--central-var_cache_centreon_backup  253:2   0    5G  0 lvm  /var/cache/centreon/backup
  ├─centos_centreon--central-var_lib_centreon           253:3   0   11G  0 lvm  /var/lib/centreon
  ├─centos_centreon--central-var_lib_centreon--broker   253:4   0    5G  0 lvm  /var/lib/centreon-broker
  ├─centos_centreon--central-var_lib_mysql              253:5   0   16G  0 lvm  /var/lib/mysql
  └─centos_centreon--central-var_log                    253:6   0   10G  0 lvm  /var/log
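For readers without access to the tutorial, the usual sequence for this kind of extension looks roughly like the sketch below. It is not the tutorial's exact procedure: it assumes the virtual disk was already enlarged in the hypervisor, that growpart (cloud-utils-growpart) is installed, and the 4G increment is a hypothetical value. By default it only prints the commands.

```shell
#!/bin/sh
# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Grow a partition, its LVM physical volume, then one logical volume.
# lvextend -r also resizes the filesystem on the logical volume.
extend_lv() {
    disk=$1
    part=$2
    lv=$3
    size=$4
    run growpart "$disk" "$part"
    run pvresize "${disk}${part}"
    run lvextend -r -L "+$size" "$lv"
}

# Device names taken from the lsblk output above; the size is an assumption.
extend_lv /dev/sda 2 /dev/mapper/centos_centreon--central-var_lib_centreon 4G
```

Setting DRY_RUN=0 and running as root would execute the commands for real, so it is worth reviewing the printed output and taking a snapshot of the VM first.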
After that I had to restart the server and delete the contents of /var/lib/centreon/centplugins one last time to get back to normal behavior.
Thanks to @kduret, as well as qgarnier and Sims24, who helped me on the Centreon Slack.