How to use the thresholds

  • 13 June 2022

In this article, we describe what a threshold is and how to set one up.

What is a threshold?

As you probably know, Centreon uses plugins, run through commands, to check that everything is fine on a host or service. On its own, though, a plugin only returns information and does not alert on anything.

A threshold is a value that defines when a measurement becomes a problem. For example, when checking disk space, we can set a threshold that says “if the disk is filled to more than 80% of its total space, that is a problem, so tell me when it happens”.
Put like that it seems trivial, but it is a very important thing to set on hosts and services. Without thresholds, the only way to know whether there is a problem would be to check every host and service by hand, which would make monitoring pretty useless.
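To make the idea concrete, here is a minimal Python sketch of how a check could turn a disk usage value into a state. It is not Centreon code: the function name `disk_state` and the 90% critical limit are assumptions made for illustration (only the 80% figure comes from the example above), and the numeric states follow the usual plugin exit codes (0 for OK, 1 for WARNING, 2 for CRITICAL).

```python
# Usual monitoring plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def disk_state(used_percent: float, warning: float = 80.0, critical: float = 90.0) -> int:
    """Return the state for a disk usage value, given two upper limits.

    The 90% critical default is an assumption for this sketch.
    """
    if used_percent > critical:
        return CRITICAL
    if used_percent > warning:
        return WARNING
    return OK


print(disk_state(42.0))  # 0 (OK)
print(disk_state(85.0))  # 1 (WARNING)
print(disk_state(95.0))  # 2 (CRITICAL)
```

Without the two limits, the check could only report the raw percentage; with them, it can report a state that the monitoring platform can alert on.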

 

How does it work?
 

1- Setting a threshold in the web UI

(Screenshot: setting a threshold)

In the custom macros section, just enter the value to use as the warning or critical threshold.

 

Note that the thresholds described here do not apply to macros that contain “status”. Those macros take string expressions and have to be set differently, like this:

 

CRITICALSTATUS= %{status} !~ /active/i
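For readers less used to Perl-style expressions, the sketch below shows in Python what `%{status} !~ /active/i` is testing: the check is critical when the status string does not match the pattern `active`, ignoring case. The function name `is_critical` and the sample status strings are hypothetical, chosen only for illustration.

```python
import re


def is_critical(status: str) -> bool:
    """Mirror `%{status} !~ /active/i`: critical when 'active' is NOT found."""
    return re.search(r"active", status, re.IGNORECASE) is None


print(is_critical("Active"))    # False
print(is_critical("failed"))    # True
print(is_critical("inactive"))  # False, because "inactive" contains "active"
```

The last line shows a point worth keeping in mind with this kind of expression: the pattern is not anchored, so any status that merely contains the word also matches.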

 

By the way, if the threshold is not specific to a single service or host, you can set it directly on the template.

 

2- Different types of thresholds

  • Outside the range {0:X}

First, the most common form: just enter the value, as in the screenshot above. If the data falls outside the range between 0 and your threshold (X), the state changes.

[root@poller1 ~]# /usr/lib/centreon/plugins//centreon_linux_snmp.pl --plugin=os::linux::snmp::plugin --mode=cpu --hostname=svlinuxpar.centreon.training --snmp-version='2c' --snmp-community='os_linux' --snmp-autoreduce --warning-average='80'  --critical-average='90'

OK: 4 CPU(s) average usage is 3.00 % | 'total_cpu_avg'=3.00%;0:80;0:90;0;100 'cpu_0'=4.00%;;;0;100 'cpu_1'=3.00%;;;0;100 'cpu_2'=2.00%;;;0;100 'cpu_3'=3.00%;;;0;100
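The part of the plugin output after the `|` is the performance data, where each entry has the form `'label'=value[unit];warning;critical;min;max`. The following Python sketch pulls those fields out of a shortened copy of the output above, which makes it easy to see that a threshold entered as `80` is reported as the range `0:80`. The regular expression and the name `parse_perfdata` are assumptions written for this example, not part of any Centreon tool.

```python
import re

# One perfdata entry: 'label'=value[unit];warning;critical;min;max
PERF_RE = re.compile(
    r"'(?P<label>[^']+)'=(?P<value>-?\d+(?:\.\d+)?)(?P<unit>[^;\s]*)"
    r";(?P<warning>[^;\s]*);(?P<critical>[^;\s]*);(?P<min>[^;\s]*);(?P<max>[^;\s]*)"
)


def parse_perfdata(plugin_output: str) -> dict:
    """Map each metric label to its perfdata fields (all kept as strings)."""
    _, _, perfdata = plugin_output.partition("|")
    return {m["label"]: m.groupdict() for m in PERF_RE.finditer(perfdata)}


sample = (
    "OK: 4 CPU(s) average usage is 3.00 % | "
    "'total_cpu_avg'=3.00%;0:80;0:90;0;100 'cpu_0'=4.00%;;;0;100"
)
metrics = parse_perfdata(sample)
print(metrics["total_cpu_avg"]["warning"])   # 0:80
print(metrics["total_cpu_avg"]["critical"])  # 0:90
```

Metrics without thresholds, such as `cpu_0` here, simply come back with empty warning and critical fields.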

 

  • Under the threshold

Next, you can put a colon at the end of your value (X:). The service or host then changes state when the data is lower than X, that is, outside the range between X and ∞.

[root@poller1 ~]# /usr/lib/centreon/plugins//centreon_linux_snmp.pl --plugin=os::linux::snmp::plugin --mode=cpu --hostname=svlinuxpar.centreon.training --snmp-version='2c' --snmp-community='os_linux' --snmp-autoreduce --warning-average=''  --critical-average='90:'

CRITICAL: 4 CPU(s) average usage is 4.50 % | 'total_cpu_avg'=4.50%;;90:;0;100 'cpu_0'=1.00%;;;0;100 'cpu_1'=6.00%;;;0;100 'cpu_2'=5.00%;;;0;100 'cpu_3'=6.00%;;;0;100

 

  • Above the threshold

Now put the colon before your threshold and add a tilde, like this: ~:X
This is the opposite of the previous form: the service or host changes state when the data is strictly greater than the threshold, that is, outside the interval between -∞ and X.

It is basically the same as the classic form, but the range that does not alert is larger: the classic form only covers 0 to X, so a negative value triggers a change of state, whereas with ~:X it does not.

In case you are wondering, the tilde stands for negative infinity.

[root@poller1 ~]# /usr/lib/centreon/plugins//centreon_linux_snmp.pl --plugin=os::linux::snmp::plugin --mode=cpu --hostname=svlinuxpar.centreon.training --snmp-version='2c' --snmp-community='os_linux' --snmp-autoreduce --warning-average=''  --critical-average='~:90'

 

  • Outside the Range: Two numbers

Another way is to set two numbers in the threshold, like this: X:Y. As you can guess, the state changes when the data is outside the range between X and Y. For example, with 10:20 it alerts if the data is below 10 (between -∞ and 10) or above 20 (between 20 and ∞).

[root@poller1 ~]# /usr/lib/centreon/plugins//centreon_linux_snmp.pl --plugin=os::linux::snmp::plugin --mode=cpu --hostname=svlinuxpar.centreon.training --snmp-version='2c' --snmp-community='os_linux' --snmp-autoreduce --warning-average=''  --critical-average='10:20'

 

  • In range: The power of @

Finally, you can set it up like this: @X:Y. What is new compared to before, you may ask? Everything so far alerted when a value was outside a range; with the @, it alerts when the value is inside the range. The state changes if the data is greater than or equal to X and lower than or equal to Y.

[root@poller1 ~]# /usr/lib/centreon/plugins//centreon_linux_snmp.pl --plugin=os::linux::snmp::plugin --mode=cpu --hostname=svlinuxpar.centreon.training --snmp-version='2c' --snmp-community='os_linux' --snmp-autoreduce --warning-average=''  --critical-average='@80:90'

OK: 4 CPU(s) average usage is 4.25 % | 'total_cpu_avg'=4.25%;;@80:90;0;100 'cpu_0'=5.00%;;;0;100 'cpu_1'=4.00%;;;0;100 'cpu_2'=3.00%;;;0;100 'cpu_3'=5.00%;;;0;100

 

One last thing: you can set negative thresholds simply by putting a minus sign before the value, like “-10”. This is useful for metrics that can be negative, such as temperature, or for the storage service of the Windows SNMP plugin.
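To summarise the forms covered in this section, here is a small Python sketch of how such a threshold string could be read and applied to a value. It is an illustration written for this article, not the parser the Centreon plugins actually use, and the names `Range`, `parse_range` and `alerts` are hypothetical. One assumption to be aware of: the sketch rejects a range whose start is greater than its end, so a bare negative number such as -10 (which reads as 0:-10) raises an error here, and the negative examples use explicit ranges instead.

```python
import math
from typing import NamedTuple


class Range(NamedTuple):
    start: float
    end: float
    inside: bool  # True when the threshold starts with '@'


def parse_range(text: str) -> Range:
    """Parse 'X', 'X:', '~:X', 'X:Y' or '@X:Y' into a Range."""
    inside = text.startswith("@")
    if inside:
        text = text[1:]
    if ":" in text:
        left, right = text.split(":", 1)
    else:
        left, right = "", text  # a bare X is shorthand for 0:X
    if left == "~":
        start = -math.inf       # the tilde stands for negative infinity
    elif left == "":
        start = 0.0
    else:
        start = float(left)
    end = math.inf if right == "" else float(right)
    if start > end:
        # Assumption of this sketch: such a range is treated as invalid.
        raise ValueError(f"start is greater than end in {text!r}")
    return Range(start, end, inside)


def alerts(value: float, threshold: str) -> bool:
    """Return True when the value should trigger a change of state."""
    r = parse_range(threshold)
    in_range = r.start <= value <= r.end
    # Without '@' we alert outside the range, with '@' inside it.
    return in_range if r.inside else not in_range


for value, threshold in [
    (3.0, "80"), (4.5, "90:"), (95.0, "~:90"),
    (4.5, "10:20"), (85.0, "@80:90"), (-20.0, "-10:"),
]:
    print(f"value={value} threshold={threshold!r} alert={alerts(value, threshold)}")
```

Keeping the parsing separate from the comparison means the same `alerts` function can serve both the warning and the critical threshold of a check.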

 



2 replies


Great!

Another useful reference on threshold syntax, from the Nagios Plugins documentation: https://nagios-plugins.org/doc/guidelines.html#THRESHOLDFORMAT


Thanks for these great tips; I think you made a slight mistake with the command line examples for “Above the threshold” and “Outside the Range: Two number”. Otherwise the explanations are on point!
