Question

NSClient issue with cpu (physical machine with more than 64 threads)

  • 12 December 2022
  • 11 replies

Userlevel 5
Badge +14

Hello,

We recently deployed some Hyper-V nodes.

They are running Windows Server 2022 with centreon-nsclient 5.2.41, and they have more than 64 threads.

 

The CPU check returns abnormal values.

 

This is probably a bug in NSClient, but has anyone encountered this problem and managed to fix it?

 

Thanks in advance


11 replies

Hi,

What command are you calling exactly?

To troubleshoot, check which command the NSClient++ agent runs, and run it locally on your Windows host to see what result it gives.

If it gives the same strange result, look at the script to find out which underlying Windows command(s) it runs.

Userlevel 2
Badge +11

Maybe try lodctr /R and then restart your agent.

 

Userlevel 5
Badge +14

hello

I tried the lodctr /R trick before; it didn’t improve anything (I even retried it).

 

Here is the Centreon command line:

/usr/lib/centreon/plugins/centreon_nsclient_restapi.pl --plugin=apps::nsclient::restapi::plugin --mode=query --hostname='xxx' --port='8443' --proto='https' --legacy-password='xxx' --ssl-opt="SSL_verify_mode => SSL_VERIFY_NONE" --command=check_cpu --arg="warning=none" --arg="critical=time = '5m' and load > 95" --arg=show-all

It outputs this:

OK: 5m: -138687045332%, 1m: -138471207623%, 5s: -330723515289% | 'total 5m'=-138687045332%;0;95;; 'total 1m'=-138471207623%;0;95;; 'total 5s'=-330723515289%;0;95;;
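Note that the status is still OK: the critical rule only fires when load > 95, so a hugely negative value never trips it. A minimal sketch of a sanity check on the plugin output, assuming the standard Nagios perfdata layout shown above; the function name `find_invalid_cpu_values` is hypothetical and not part of any Centreon plugin:

```python
import re

# Perfdata entries look like: 'label'=value[unit];warn;crit;min;max
PERFDATA_RE = re.compile(r"'([^']+)'=(-?\d+(?:\.\d+)?)(%?)")


def find_invalid_cpu_values(plugin_output):
    """Return {label: value} for percentage metrics outside the 0..100 range."""
    # Everything after the first "|" is perfdata; before it is the human text.
    _, _, perfdata = plugin_output.partition("|")
    invalid = {}
    for label, value, unit in PERFDATA_RE.findall(perfdata):
        number = float(value)
        if unit == "%" and not 0 <= number <= 100:
            invalid[label] = number
    return invalid


sample = (
    "OK: 5m: -138687045332%, 1m: -138471207623%, 5s: -330723515289% | "
    "'total 5m'=-138687045332%;0;95;; 'total 1m'=-138471207623%;0;95;; "
    "'total 5s'=-330723515289%;0;95;;"
)

# Every metric in the sample is far outside 0..100, so all three are reported.
print(find_invalid_cpu_values(sample))
```

A wrapper like this could turn the result into UNKNOWN instead of OK, so that the broken values do not pollute graphs silently.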

But the Centreon plugin is not the cause, as I get the same issue locally in debug/test mode.

 

I run “./nscp.exe test” locally on the server to get the NSClient shell, and then run this:

check_cpu show-all warning=none critical=load>90 critical=time='5m'

(I couldn’t run exactly the same command arguments in debug mode, but I get the same perf data.)

The error is intermittent: sometimes the values are correct, and sometimes they are not.

 

Even without arguments, a plain “check_cpu” gives strange output.

 

I have opened an issue on the nscp GitHub, since this is not a Centreon development. There has been no NSClient release for a long time, but the maintainer seems active.

Userlevel 2
Badge +11

Hmm, I think you may have the wrong NSClient agent on your system.

Did you choose the 32-bit one instead of the 64-bit one?

Userlevel 5
Badge +14

The OS is 64-bit and the agent is 64-bit. The issue is really specific and, as I said, probably a bug in NSClient.

This only happens on 5 servers out of 1500, but more importantly:

it only happens on Windows 2022 hardware servers with more than 64 threads (fresh servers we just deployed for a client).

We rarely have big servers like that running Windows; usually they run some other hypervisor.
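For context, Windows splits logical processors into processor groups of at most 64, so a host with more than 64 threads has more than one group. Whether that is related to this bug is only an assumption. A minimal sketch for checking how an affected host is laid out, using the Win32 functions `GetActiveProcessorGroupCount` and `GetActiveProcessorCount` through ctypes; the helper name `processor_group_info` is hypothetical:

```python
import ctypes
import sys

ALL_PROCESSOR_GROUPS = 0xFFFF  # Win32 constant: count processors in all groups


def processor_group_info():
    """Return (group_count, logical_processor_count), or None when not on Windows."""
    if sys.platform != "win32":
        return None
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    # WORD GetActiveProcessorGroupCount(void)
    kernel32.GetActiveProcessorGroupCount.restype = ctypes.c_ushort
    # DWORD GetActiveProcessorCount(WORD GroupNumber)
    kernel32.GetActiveProcessorCount.argtypes = [ctypes.c_ushort]
    kernel32.GetActiveProcessorCount.restype = ctypes.c_uint32
    groups = kernel32.GetActiveProcessorGroupCount()
    total = kernel32.GetActiveProcessorCount(ALL_PROCESSOR_GROUPS)
    return groups, total


info = processor_group_info()
if info is None:
    print("Not running on Windows, nothing to query")
else:
    print(f"{info[0]} processor group(s), {info[1]} logical processors in total")
```

Comparing this output between an affected host and a healthy one would show whether the faulty servers are exactly the ones with more than one processor group.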

 

My original question was whether anyone else had encountered this issue and managed to solve it, or could confirm whether it is a general problem or specific to my setup, or whether it is caused by the Hyper-V kernel running on these machines, etc. Anything that helps debug and identify the cause is welcome.

 

The only place where it could go wrong is here: https://github.com/mickem/nscp/blob/9886d7c066e3f10a45f4043bdbb7528d41204be2/include/win_sysinfo/win_sysinfo.cpp#L401
There are casts to double and subtractions there, or maybe Windows returns bad values. I can read the code, but I can’t debug it. This is the lowest-level code NSClient has for getting CPU information; below that it is the Windows API. This function only fetches the information from the API, and it is the only “computation” the values go through.
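To illustrate how subtractions on raw counters can produce a value like the ones above, here is a sketch under stated assumptions. The formula is a generic "100 minus idle share" calculation and is not copied from nscp; the function names and the sample numbers are made up, and nothing here confirms where the real fault lies:

```python
def cpu_load_percent(prev, curr):
    """Load in percent from two (idle_time, total_time) counter samples.

    Hypothetical formula for illustration only, not the nscp implementation.
    """
    idle_delta = float(curr[0] - prev[0])
    total_delta = float(curr[1] - prev[1])
    if total_delta == 0:
        return 0.0
    return 100.0 - (idle_delta * 100.0 / total_delta)


def checked_cpu_load_percent(prev, curr):
    """Same calculation, but reject results outside the 0..100 range."""
    load = cpu_load_percent(prev, curr)
    return load if 0.0 <= load <= 100.0 else None


# Consistent samples: idle grew by 600 out of a total of 1000 ticks.
sane_prev, sane_curr = (1_000, 4_000), (1_600, 5_000)
print(cpu_load_percent(sane_prev, sane_curr))  # 40.0

# Made-up inconsistent samples: idle appears to grow far more than total,
# as could happen if one of the counters is read incorrectly.
bad_prev, bad_curr = (0, 0), (5_000_000_000, 1)
print(cpu_load_percent(bad_prev, bad_curr))          # hugely negative
print(checked_cpu_load_percent(bad_prev, bad_curr))  # None
```

The point is only that the formula has no built-in bounds: any inconsistency between the idle and total counters passes straight through to the reported percentage unless the result is range-checked.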

As I said, I opened an issue on the nscp GitHub repository.

Badge +5

@christophe.niel-ACT were you able to resolve this? We just noticed this after upgrading many of our Hyper-V hosts moving over to Windows 2022 and many more CPUs than before. Older servers with the same version of NSClient are still fine.

Userlevel 5
Badge +14

hello

No, I have not found a solution. The bug is open on the community project “nscp”, which is the upstream of NSClient.

I don’t have any way to fix this issue, so for now I’m ignoring it. I also don’t have many big Windows servers; mine are mostly physical ESXi hosts.

The issue is open here:

check_cpu return abnormal (negative billions) percentage · Issue #780 · mickem/nscp · GitHub

If you know someone who knows C/C++ and has enough expertise...

Badge +5

Thanks, it looks like I will have to submit a support ticket directly, as Centreon often ignores GitHub and has in the past even ignored fixes in the production releases.

Userlevel 5
Badge +14

Centreon is not the developer of NSClient or of nscp; it’s an agent they use.

Centreon did develop a plugin that uses WMI without a third-party agent; I have not tried it yet.

Badge +5

I know that, but they have their own fork of it, which should be maintained:
GitHub - centreon/centreon-nsclient-build: Source use to build the centreon NSClient agent

WMI is basically deprecated because MS no longer supports WMIC, so the choice is either WSMAN, with the nightmare of maintaining SSL certificates for WinRM on all your servers, or the NSClient++ agent. There is also SNMP if you’re allowed to be insecure with only v2, as v3 is only available through a third-party agent, if you can be bothered.

Userlevel 1
Badge +6

Hi @christophe.niel-ACT , thanks for your interest in Centreon.

We are considering benchmarking SNClient+ (https://github.com/ConSol-Monitoring/snclient) as a replacement for NSClient, which is deprecated.

Have you already tried SNClient+ for your needs?

 

Best regards
