Hello, I'm sharing with the community an issue I encountered, its underlying cause, and how it can be resolved.
This happened on a fairly heavily loaded central server (poller included). My end users reported regularly receiving loads of email alerts, arriving in bursts that lasted a few minutes.
When I investigated the failing commands, there was nothing to correlate them: any service could fail, and all failed with the same output: “Execute command failed”. Something strange was at work here.
After digging further into the logs, especially the centreon-engine logs, I found it: “Too many open files”. That was the issue.
At first I suspected a permissions problem or something similar, but that could not be the cause, since the failure could hit any command, including ones that usually work fine.
So I dug deeper on the internet and found it: the Linux file descriptor limit (see cyberciti.biz - linux increase the maximum number of open files). By default, this per-process limit is usually around 1024.
I then ran a quick
ls -l /proc/[centengine_pid]/fd/
and there I saw it: hundreds of pipe files created by the many commands running on the server.
It seems the limit was sometimes reached, and from then on every subsequent command failed in sequence.
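To confirm this kind of diagnosis, you can compare how many descriptors the process holds with its current limit. Here is a minimal sketch, assuming a Linux /proc filesystem. The function names count_open_fds and soft_nofile_limit are my own, and the script falls back to the shell's own PID when no centengine process is found, so it can be tried anywhere.

```shell
#!/bin/sh
# Count the file descriptors a process currently holds open.
count_open_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Read the soft "Max open files" limit of a running process.
# In /proc/<pid>/limits the 4th field of that row is the soft limit.
soft_nofile_limit() {
    awk '/^Max open files/ {print $4}' "/proc/$1/limits"
}

# Use the centengine PID if there is one, otherwise this shell's PID
# (assumption: pidof is installed; reading another user's /proc entry
# may require root).
pid=$(pidof -s centengine 2>/dev/null || echo $$)
echo "pid $pid: $(count_open_fds "$pid") open / limit $(soft_nofile_limit "$pid")"
```

If the first number gets close to the second during an alert burst, the limit is very likely what you are hitting.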
The solution here is to increase the open-file limit of centreon-engine in its systemd service. The unit file is at /usr/lib/systemd/system/centengine.service (for me), and the default values can be found at centreon-collect/engine/scripts/centengine.service.in at 25.10-latest · centreon/centreon-collect
You will need to add the following block:
[Service]
#... old configuration entries
LimitNOFILE=4096
This adds LimitNOFILE to the Service section, raising the limit to 4096 (or whatever value you need, but if you need more … maybe blame it on the command programs being run).
Just don't forget to run the following commands to apply the configuration:
systemctl daemon-reload
systemctl restart centengine
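As an alternative to editing the packaged unit file (which a package update may overwrite), systemd also reads drop-in files from a centengine.service.d directory under /etc/systemd/system. Below is a hedged sketch of that approach. The helper name write_nofile_dropin is hypothetical, and the value 4096 is simply the one used above.

```shell
#!/bin/sh
# Write a systemd drop-in that sets LimitNOFILE for a unit.
# $1 = drop-in directory, $2 = desired limit.
write_nofile_dropin() {
    unit_dir="$1"
    limit="$2"
    mkdir -p "$unit_dir"
    printf '[Service]\nLimitNOFILE=%s\n' "$limit" > "$unit_dir/override.conf"
}

# Hypothetical usage, as root:
# write_nofile_dropin /etc/systemd/system/centengine.service.d 4096
# systemctl daemon-reload
# systemctl restart centengine
# systemctl show centengine -p LimitNOFILE   # check the value systemd applied
```

Running systemctl edit centengine creates the same kind of override.conf interactively, if you prefer not to script it.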
Another solution is to reduce the number of concurrent commands run by the Centreon engine. This can be changed in the tuning tab of the Centreon engine configuration through the web UI. Yet another is simply to reduce the load by using a dedicated poller.
So there you go, a small but troublesome issue I encountered. Hope this helps.
Related issues on watch:
Service check command execution failed: pipe creation failed: Too many open files | Community
