Services go UNKNOWN on overloaded servers

Hello everyone,

I have a problem with two monitored servers that often run under a fairly high load.

It seems that when a server is very busy, SNMP responses get lost, which puts all of my services in the UNKNOWN state. This is quite painful, because spurious notifications then follow one another without much relevance.

Do you have any idea what could be done to prevent this?


Hello,

You may want to try increasing the snmp-timeout value. The default is one second, and an overloaded server can struggle to answer within that time.

For example, add --snmp-timeout=3 to the EXTRAOPTIONS macro at the Host level.

Let me know if it’s better.

Thank you for your answer. Setting the timeout to 3 greatly reduces the number of emails Centreon sends me, even if it still sends a few false positives.

Would increasing the timeout value further help?

The results returned by the services, however, have not really improved.

Does this timeout problem on overloaded servers make sense to you?

Yes it makes sense.

You can also try to use the snmp-retries option. I would recommend setting it to --snmp-retries=10 (twice the default value).
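To get a feel for how the two options combine, here is a rough sketch of the arithmetic. It assumes the usual Net-SNMP convention (one initial attempt plus the configured number of retries, each waiting the full timeout); the helper name worst_case_wait is made up for this illustration and is not part of any plugin.

```python
def worst_case_wait(timeout_s: float, retries: int) -> float:
    """Upper bound on how long one unanswered SNMP request can block.

    Assumption: one initial attempt plus `retries` re-sends, each
    waiting `timeout_s` seconds before giving up.
    """
    return timeout_s * (1 + retries)


# Default values, the first suggestion, and both suggestions combined.
for timeout_s, retries in [(1, 5), (3, 5), (3, 10)]:
    wait = worst_case_wait(timeout_s, retries)
    print(f"timeout={timeout_s}s retries={retries} -> up to {wait:.0f}s per request")
```

Under that assumption, a timeout of 3 with 10 retries means a single unanswered request can take up to 33 seconds to fail, so it is worth comparing that figure with whatever overall check timeout your poller applies.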

Also, I see that the last check time is the same for every check. I would recommend not forcing a check on every item at the same time.

So, if I summarize:

Add --snmp-retries=10 to your Host(s), after the --snmp-timeout option added previously.
Don’t force all checks at the same time: force one check per service, leaving at least 30 seconds between them.

That should do the trick.
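If it helps to picture the spacing, here is a small sketch that only computes when each forced check could be launched, 30 seconds apart. It does not talk to Centreon at all; the function staggered_schedule and the service names are hypothetical and only serve the illustration.

```python
from datetime import datetime, timedelta


def staggered_schedule(services, start, gap_seconds=30):
    """Return (service, run_at) pairs spaced `gap_seconds` apart."""
    return [
        (name, start + timedelta(seconds=index * gap_seconds))
        for index, name in enumerate(services)
    ]


# Hypothetical service names, for illustration only.
services = ["Cpu", "Memory", "Load", "Swap"]
start = datetime(2024, 1, 1, 12, 0, 0)

for name, run_at in staggered_schedule(services, start):
    print(f"{run_at:%H:%M:%S}  {name}")
```

The point is simply that no two forced checks hit the already loaded server at the same moment.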

OK for me.

Thanks a lot!
