
Hi,

I’m currently running Centreon 23.04.7 on Debian 11.

After a server reboot, the centreon-broker daemon won’t start. This may have started after the disk filled up: /var/lib/centreon/metrics was taking all the space on my HDD, so I deleted all the files in that directory.

Looking at the service logs, the service restarts in a loop:

● cbd.service - Centreon Broker watchdog
Loaded: loaded (/lib/systemd/system/cbd.service; enabled; vendor preset: enabled)
Active: active (running) since Sat 2023-09-23 15:56:18 CEST; 9min ago
Main PID: 1131 (cbwd)
Tasks: 3 (limit: 9492)
Memory: 1.6M
CPU: 1.052s
CGroup: /system.slice/cbd.service
└─1131 /usr/sbin/cbwd /etc/centreon-broker/watchdog.json

sept. 23 16:05:10 centreon cbwd[1131]: [2023-09-23T16:05:10.263+02:00] [cbwd] [error] cbd instance with PID 6824 has stopped, attempt to restart it

In /var/log/messages, I’ve got this beautiful segfault error:

centreon kernel: [  338.996927] cbwd[4904]: segfault at 0 ip 00007f9a120f2334 sp 00007ffdab1adfe0 error 4 in launcher.preload.so[7f9a120f2000+2000]
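
For anyone decoding a similar kernel line, here is a minimal Python sketch. It assumes the usual x86 Linux format `segfault at ADDR ip IP sp SP error CODE in MODULE[BASE+SIZE]`, with the error code printed in hexadecimal and its low bits meaning: bit 0 set = page was present, bit 1 set = write access, bit 2 set = fault in user mode. The names `decode_segfault` and `SAMPLE` are only illustrative.

```python
import re

# Kernel line from the post, written in the usual bracketed form.
SAMPLE = ("cbwd[4904]: segfault at 0 ip 00007f9a120f2334 sp 00007ffdab1adfe0 "
          "error 4 in launcher.preload.so[7f9a120f2000+2000]")

SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<module>\S+?)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)

def decode_segfault(line):
    """Split a kernel segfault line into its fields; return None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    address = int(m.group("addr"), 16)
    err = int(m.group("err"), 16)  # assumption: x86 page-fault error code, in hex
    info = {
        "address": address,
        "null_address": address == 0,
        "page_present": bool(err & 0x1),
        "access": "write" if err & 0x2 else "read",
        "user_mode": bool(err & 0x4),
        "module": m.group("module"),
        "offset_in_mapping": None,
    }
    if m.group("base") is not None:
        # Distance of the instruction pointer from the start of the mapped region.
        info["offset_in_mapping"] = int(m.group("ip"), 16) - int(m.group("base"), 16)
    return info

if __name__ == "__main__":
    print(decode_segfault(SAMPLE))
```

For the line above, this reports a user-mode read of an unmapped page at address 0, with the instruction pointer 0x334 bytes into the launcher.preload.so mapping.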

After looking through ALL the available logs, the only one that mentions centreon-broker is /var/log/centreon-broker/watchdog.log, which prints this in a loop:

[2023-09-23T16:04:55.245+02:00] [cbwd] [info] Starting progress 'central-rrd-master'
[2023-09-23T16:04:55.245+02:00] [cbwd] [info] Process 'central-rrd-master' started (PID 6661)
[2023-09-23T16:04:55.245+02:00] [cbwd] [error] cbd instance with PID 6660 has stopped, attempt to restart it
[2023-09-23T16:04:55.245+02:00] [cbwd] [info] Starting progress 'central-broker-master'
[2023-09-23T16:04:55.245+02:00] [cbwd] [info] Process 'central-broker-master' started (PID 6662)
[2023-09-23T16:04:55.245+02:00] [cbwd] [error] cbd instance with PID 6661 has stopped, attempt to restart it

Looks like a restart loop of the service.

I’ve tried everything, from changing the ownership of the configuration files to reinstalling the whole centreon-broker package. Nothing works; I’m just stuck right now.

Has anyone already encountered this and might have a little hint for me? I’m running out of ideas…

Thanks a lot!

Thomas

After analyzing a core dump of the cbwd process, it looks like it crashed in arr_len, a function in launcher.preload.so that was called with a null array, and the faulting address 0 in the segfault message (“segfault at 0”) indicates a null pointer dereference.

 

coredumpctl dump 45546 --output 45546.core
PID: 45546 (cbwd)
UID: 111 (centreon-broker)
GID: 118 (centreon-broker)
Signal: 11 (SEGV)
Timestamp: Sat 2023-09-23 17:09:01 CEST (4min 30s ago)
Command Line: /usr/sbin/cbwd /etc/centreon-broker/watchdog.json
Executable: /usr/sbin/cbwd
Control Group: /system.slice/cbd.service
Unit: cbd.service
Slice: system.slice
Boot ID: f129aba73cc9492c8332b20d5aaaa0ea
Machine ID: 9a7d6ee142444c3aa49386fa8938bf9c
Hostname: centreon
Storage: /var/lib/systemd/coredump/core.cbwd.111.f129aba73cc9492c8332b20d5aaaa0ea.45546.1695481741000000.zst
Message: Process 45546 (cbwd) of user 111 dumped core.

Stack trace of thread 45546:
#0 0x00007f5d04252334 arr_len (launcher.preload.so + 0x1334)
#1 0x00007f5d0425253e execve (launcher.preload.so + 0x153e)
#2 0x00005626d1c3ca73 n/a (cbwd + 0x32a73)
#3 0x00005626d1c18c44 n/a (cbwd + 0xec44)
#4 0x00007f5d03e8dd0a __libc_start_main (libc.so.6 + 0x23d0a)
#5 0x00005626d1c18eda n/a (cbwd + 0xeeda)

gdb /usr/sbin/cbwd 45546.core
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f5d04252334 in arr_len (arr=0x0) at ../preload_go_c/preload.c:21

 


After reinstalling the whole of Centreon, I’ve found that the problem is caused by installing the Datadog agent on the same host as Centreon (it was used to monitor the Centreon host). I don’t know why it does that, but if anybody is in the same situation as me, don’t install the Datadog agent on the same host.
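
For anyone who wants to check whether a library such as launcher.preload.so is being forced into processes on their own host, here is a minimal Python sketch. It assumes the library arrives through the standard glibc preload mechanisms (the /etc/ld.so.preload file or the LD_PRELOAD environment variable), which this thread does not confirm. The function name `preloaded_libraries` is only illustrative, and treating `#` as a comment marker in the file is an assumption as well.

```python
import os
import re

def preloaded_libraries(preload_file="/etc/ld.so.preload", environ=None):
    """Return the libraries the dynamic loader is asked to preload."""
    environ = os.environ if environ is None else environ
    libs = []
    # System-wide list: entries separated by whitespace or colons.
    try:
        with open(preload_file) as fh:
            for line in fh:
                line = line.split("#", 1)[0]  # assumption: '#' starts a comment
                libs += [entry for entry in re.split(r"[\s:]+", line) if entry]
    except FileNotFoundError:
        pass  # no system-wide preload file on this host
    # Per-process list inherited from the environment.
    libs += [entry for entry in re.split(r"[\s:]+", environ.get("LD_PRELOAD", "")) if entry]
    return libs

if __name__ == "__main__":
    for lib in preloaded_libraries():
        print(lib)
```

An empty result means neither mechanism is in use for the current environment. Note that a service started by systemd may see a different LD_PRELOAD than an interactive shell.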

