
Hello everyone.

We are currently facing an issue with the Azure Virtual Machine plugin.

We monitor 13 VMs, and on every check at least 3 of them (a random set, sometimes 4, 5 or 6) appear as DOWN in our GUI.

 

Meanwhile, the other services associated with these VMs work like a charm.

When I run the check command from the poller with the --debug argument, I get the following output:

== Info: TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
== Info: Connection state changed (MAX_CONCURRENT_STREAMS == 100)!
=> Recv header: HTTP/2 200
=> Recv header: cache-control: no-cache
=> Recv header: pragma: no-cache
=> Recv header: content-length: 663
=> Recv header: content-type: application/json; charset=utf-8
=> Recv header: expires: -1
=> Recv header: strict-transport-security: max-age=31536000; includeSubDomains
=> Recv header: x-content-type-options: nosniff
=> Recv header: x-ms-ratelimit-remaining-subscription-resource-requests: 97
=> Recv header: x-ms-request-id: 142788ce-1818-4a30-9503-2bbc3e53260b
=> Recv header: x-ms-correlation-request-id: 142788ce-1818-4a30-9503-2bbc3e53260b
=> Recv header: x-ms-routing-request-id: FRANCESOUTH:20240909T081955Z:142788ce-1818-4a30-9503-2bbc3e53260b
=> Recv header: x-cache: CONFIG_NOCACHE
=> Recv header: x-msedge-ref: Ref A: 24ADA54643B541E9A28D56FB485E1783 Ref B: MRS211050313023 Ref C: 2024-09-09T08:19:54Z
=> Recv header: date: Mon, 09 Sep 2024 08:19:55 GMT
=> Recv header:
=> Recv data: {"id":"/subscriptions/86156cfa-6f0c-4ce4-8950-3510c75d129f/resourcegroups/rg-gpsweb-nonprod-006/providers/microsoft.compute/virtualmachines/vm-gpsweb-nonprod-006-01/providers/Microsoft.ResourceHealth/availabilityStatuses/current","name":"current","type":"Microsoft.ResourceHealth/AvailabilityStatuses","location":"francecentral","properties":{"availabilityState":"Unknown","title":"Unknown","summary":"We are currently unable to determine the health of this virtual machine.","reasonType":"","category":"Not Applicable","context":"Not Applicable","occuredTime":"2024-09-09T07:56:15Z","reasonChronicity":"Persistent","reportedTime":"2024-09-09T08:19:55.3097151Z"}}

 

How can I resolve this?

Thanks,

Regards,

Hello,

Any ideas?

Regards,


Hello,


@Fabrix, can you help us?

Regards,

Hi @BenjaminL,

Here are my suggestions:

  • Look into what this status means: it comes from Azure, not from our plugin (see the debug output: the message is returned by the API).
  • If these “Unknown” phases are short-lived and have no actual impact, you can increase the “Max Check Attempts” parameter on the host template so that brief occurrences do not trigger hard alerts.
  • If you no longer want to be bothered by this status, you can add --ok-status='%{status} =~ /^(Available|Unknown)$/' to the host template’s EXTRAOPTIONS macro, but be warned that this may hide serious incidents.
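
To make the last suggestion more concrete, here is a minimal Python sketch of what that regular expression accepts. It is not the plugin's code (the plugin is written in Perl): the names STRICT_OK, RELAXED_OK, availability_state and is_ok are made up for the illustration, the strict pattern is only an assumed baseline for comparison, and the payload is a trimmed copy of the response shown in the debug output.

```python
import json
import re

# Trimmed copy of the Azure Resource Health response from the debug output.
PAYLOAD = (
    '{"name":"current","properties":{"availabilityState":"Unknown",'
    '"summary":"We are currently unable to determine the health of this '
    'virtual machine.","occuredTime":"2024-09-09T07:56:15Z"}}'
)

# Hypothetical baseline for comparison: only "Available" counts as OK.
STRICT_OK = re.compile(r"^Available$")
# Same pattern as in the suggested --ok-status option.
RELAXED_OK = re.compile(r"^(Available|Unknown)$")


def availability_state(raw: str) -> str:
    """Extract the state that Azure itself reports in the API response."""
    return json.loads(raw)["properties"]["availabilityState"]


def is_ok(state: str, pattern: re.Pattern) -> bool:
    """Return True when the state matches the given OK pattern."""
    return pattern.search(state) is not None


state = availability_state(PAYLOAD)
print(state)                              # Unknown
print(is_ok(state, STRICT_OK))            # False
print(is_ok(state, RELAXED_OK))           # True
print(is_ok("Unavailable", RELAXED_OK))   # False
```

With the relaxed pattern, "Unknown" is treated like "Available" while "Unavailable" is still reported as a problem. The trade-off is the one mentioned above: a VM whose health Azure really cannot determine would also be shown as OK.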

Hello,

Thanks a lot for your answer, I’ll take a closer look.

Regards,

