
Hello everyone,

We are currently using the following plugin to retrieve all AWS CloudWatch alarms:

/usr/lib/centreon/plugins/centreon_aws_cloudwatch_api.pl --plugin=cloud::aws::cloudwatch::plugin --mode=get-alarms --custommode='awscli' --aws-secret-key='***' --aws-access-key='***' --aws-role-arn='' --proxyurl='' --region='sa-east-1' --filter-alarm-name='' --warning-status='%{state_value} =~ /INSUFFICIENT_DATA/i' --critical-status='%{state_value} =~ /ALARM/i' --verbose

We have an active alarm, which can be confirmed with the --debug option: the output shows <StateValue>ALARM</StateValue> for the "Instance memory" alarm. However, the final result is:

ok: status : skipped (no value(s))

My question is: could this be a potential bug in the plugin?

When running the native AWS CLI command:

aws cloudwatch describe-alarms --region sa-east-1 --state-value ALARM

all active alarms are correctly returned, while the plugin does not display them.

Below is the --debug output showing the alarm:

   

  <member>
        <TreatMissingData>breaching</TreatMissingData>
        <AlarmConfigurationUpdatedTimestamp>2025-09-01T18:58:33.193Z</AlarmConfigurationUpdatedTimestamp>
        <StateValue>ALARM</StateValue>
        <Threshold>40.0</Threshold>
        <StateReason>Threshold Crossed: no datapoints were received for 1 period and 1 missing datapoint was treated as [Breaching].</StateReason>
        <InsufficientDataActions/>
        <StateTransitionedTimestamp>2025-09-02T04:39:15.668Z</StateTransitionedTimestamp>
        <AlarmActions/>
        <StateUpdatedTimestamp>2025-09-02T04:39:15.668Z</StateUpdatedTimestamp>
        <ComparisonOperator>GreaterThanThreshold</ComparisonOperator>
        <AlarmName>Instance memory</AlarmName>
        <EvaluationPeriods>1</EvaluationPeriods>
        <StateReasonData>{&quot;version&quot;:&quot;1.0&quot;,&quot;queryDate&quot;:&quot;2025-09-02T04:39:15.664+0000&quot;,&quot;period&quot;:300,&quot;recentDatapoints&quot;:[],&quot;threshold&quot;:40.0,&quot;evaluatedDatapoints&quot;:[{&quot;timestamp&quot;:&quot;2025-09-02T04:34:00.000+0000&quot;}]}</StateReasonData>
        <ActionsEnabled>true</ActionsEnabled>
        <DatapointsToAlarm>1</DatapointsToAlarm>
        <Metrics>
          <member>
            <Expression>MAX(e1)</Expression>
            <ReturnData>true</ReturnData>
            <Label>Instance memory usage</Label>
            <Id>e2</Id>
          </member>
          <member>
            <Period>300</Period>
            <Expression>SELECT AVG(mem_used_percent) FROM SCHEMA(CWAgent,InstanceId) GROUP BY InstanceId ORDER BY AVG() DESC</Expression>
            <ReturnData>false</ReturnData>
            <Label>Instance</Label>
            <Id>e1</Id>
          </member>
        </Metrics>
        <CreationId>87b60bf6-d984-4612-8e58-c77ed25e9809/1751060096124</CreationId>
        <OKActions/>
        <AlarmArn>arn:aws:cloudwatch:sa-east-1:074071149174:alarm:Instance memory</AlarmArn>
        <Dimensions/>
      </member>
    </MetricAlarms>
  </DescribeAlarmsResult>
  <ResponseMetadata>
    <RequestId>af6003e5-476b-400c-a06f-77ce382f0c99</RequestId>
  </ResponseMetadata>
</DescribeAlarmsResponse>

2025-09-02 09:44:31,070 - MainThread - botocore.hooks - DEBUG - Event needs-retry.cloudwatch.DescribeAlarms: calling handler <botocore.retryhandler.RetryHandler object at 0x7f8741f3e650>
2025-09-02 09:44:31,071 - MainThread - botocore.retryhandler - DEBUG - No retry needed.
OK: 0 problem(s) detected | 'alerts'=0;;;0;
Command line: 'aws cloudwatch describe-alarms --region sa-east-1 --output json --debug'
ok: status : skipped (no value(s))
ok: status : skipped (no value(s))

 

Hello @raphael.loureiro

 

Could you run the command shown in the plugin debug output and send us the contents of the /tmp/outputAws file?

aws cloudwatch describe-alarms --region sa-east-1 --output json >> /tmp/outputAws

This will allow us to run the plugin locally and see what is happening.

Regards,


{
"CompositeAlarms": [],
"MetricAlarms": [
{
"EvaluationPeriods": 1,
"TreatMissingData": "breaching",
"AlarmArn": "arn:aws:cloudwatch:sa-east-1:074071149174:alarm:Instance StatusCheckFailed",
"StateUpdatedTimestamp": "2025-08-29T16:00:06.722Z",
"AlarmConfigurationUpdatedTimestamp": "2025-06-27T22:05:20.591Z",
"ComparisonOperator": "GreaterThanThreshold",
"AlarmActions": [],
"StateReasonData": "{\"version\":\"1.0\",\"queryDate\":\"2025-08-29T16:00:06.719+0000\",\"startDate\":\"2025-08-29T15:55:00.000+0000\",\"period\":300,\"recentDatapoints\":[0.0],\"threshold\":0.0,\"evaluatedDatapoints\":[{\"timestamp\":\"2025-08-29T15:55:00.000+0000\",\"value\":0.0}]}",
"StateValue": "OK",
"Metrics": [
{
"ReturnData": true,
"Expression": "AVG(e1)",
"Id": "e2",
"Label": "Instance StatusCheckFailed"
},
{
"ReturnData": false,
"Period": 300,
"Expression": "SELECT AVG(StatusCheckFailed_Instance) FROM \"AWS/EC2\" GROUP BY InstanceId",
"Id": "e1",
"Label": "Instance"
}
],
"Threshold": 0.0,
"AlarmName": "Instance StatusCheckFailed",
"DatapointsToAlarm": 1,
"StateReason": "Threshold Crossed: 1 out of the last 1 datapoints [0.0 (29/08/25 15:55:00)] was not greater than the threshold (0.0) (minimum 1 datapoint for ALARM -> OK transition).",
"InsufficientDataActions": [],
"OKActions": [],
"ActionsEnabled": true,
"Dimensions": []
},
{
"EvaluationPeriods": 1,
"TreatMissingData": "breaching",
"AlarmArn": "arn:aws:cloudwatch:sa-east-1:074071149174:alarm:Instance System StatusCheckFailed",
"StateUpdatedTimestamp": "2025-07-08T16:03:29.125Z",
"AlarmConfigurationUpdatedTimestamp": "2025-07-04T13:59:54.353Z",
"ComparisonOperator": "GreaterThanThreshold",
"AlarmActions": [],
"StateReasonData": "{\"version\":\"1.0\",\"queryDate\":\"2025-07-08T16:03:29.123+0000\",\"startDate\":\"2025-07-08T15:58:00.000+0000\",\"period\":300,\"recentDatapoints\":[0.0],\"threshold\":0.0,\"evaluatedDatapoints\":[{\"timestamp\":\"2025-07-08T15:58:00.000+0000\",\"value\":0.0}]}",
"StateValue": "OK",
"Metrics": [
{
"ReturnData": true,
"Expression": "AVG(e1)",
"Id": "e2",
"Label": "System StatusCheckFailed"
},
{
"ReturnData": false,
"Period": 300,
"Expression": "SELECT AVG(StatusCheckFailed_System) FROM \"AWS/EC2\" GROUP BY InstanceId",
"Id": "e1",
"Label": "AVG System StatusCheckFailed"
}
],
"Threshold": 0.0,
"AlarmName": "Instance System StatusCheckFailed",
"DatapointsToAlarm": 1,
"StateReason": "Threshold Crossed: 1 out of the last 1 datapoints [0.0 (08/07/25 15:58:00)] was not greater than the threshold (0.0) (minimum 1 datapoint for ALARM -> OK transition).",
"InsufficientDataActions": [],
"OKActions": [],
"ActionsEnabled": true,
"Dimensions": []
},
{
"EvaluationPeriods": 1,
"TreatMissingData": "breaching",
"AlarmArn": "arn:aws:cloudwatch:sa-east-1:074071149174:alarm:Instance memory",
"StateUpdatedTimestamp": "2025-09-06T05:13:23.724Z",
"AlarmConfigurationUpdatedTimestamp": "2025-09-02T15:08:20.105Z",
"ComparisonOperator": "GreaterThanThreshold",
"AlarmActions": [],
"StateReasonData": "{\"version\":\"1.0\",\"queryDate\":\"2025-09-06T05:13:23.722+0000\",\"period\":300,\"recentDatapoints\":[],\"threshold\":40.0,\"evaluatedDatapoints\":[{\"timestamp\":\"2025-09-06T05:08:00.000+0000\"}]}",
"StateValue": "ALARM",
"Metrics": [
{
"ReturnData": true,
"Expression": "MAX(e1)",
"Id": "e2",
"Label": "Instance memory usage"
},
{
"ReturnData": false,
"Period": 300,
"Expression": "SELECT AVG(mem_used_percent) FROM SCHEMA(CWAgent,InstanceId) GROUP BY InstanceId ORDER BY AVG() DESC",
"Id": "e1",
"Label": "Instance"
}
],
"Threshold": 40.0,
"AlarmName": "Instance memory",
"DatapointsToAlarm": 1,
"StateReason": "Threshold Crossed: no datapoints were received for 1 period and 1 missing datapoint was treated as [Breaching].",
"InsufficientDataActions": [],
"OKActions": [],
"ActionsEnabled": true,
"Dimensions": []
}
]
}

 


Above is the output of the aws command for verification. Thanks for replying.
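For anyone who wants to cross-check a dump like this independently of the plugin, here is a minimal Python sketch that lists each alarm name with its StateValue and counts alarms per state. It is not how the Centreon plugin processes the response. The helper names summarize_alarms and load_dump are made up for illustration, and the inline sample is a trimmed copy of the JSON above that keeps only the two fields being inspected.

```python
import json
from collections import Counter


def summarize_alarms(payload):
    """Return (AlarmName, StateValue) pairs for every alarm in a describe-alarms payload."""
    alarms = payload.get("MetricAlarms", []) + payload.get("CompositeAlarms", [])
    return [(alarm.get("AlarmName"), alarm.get("StateValue")) for alarm in alarms]


def load_dump(path):
    """Load a file that holds exactly one JSON document written by the AWS CLI."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Trimmed copy of the dump posted above: only the two fields inspected here.
sample = json.loads("""
{
  "CompositeAlarms": [],
  "MetricAlarms": [
    {"AlarmName": "Instance StatusCheckFailed", "StateValue": "OK"},
    {"AlarmName": "Instance System StatusCheckFailed", "StateValue": "OK"},
    {"AlarmName": "Instance memory", "StateValue": "ALARM"}
  ]
}
""")

# To inspect the real file instead: pairs = summarize_alarms(load_dump("/tmp/outputAws"))
pairs = summarize_alarms(sample)
for name, state in pairs:
    print(f"{state:<18} {name}")

# Number of alarms per state, e.g. to compare with the plugin's 'alerts' perfdata.
print(dict(Counter(state for _, state in pairs)))
```

Note that the command above uses >>, which appends. If it was run more than once, /tmp/outputAws will contain several JSON documents back to back and json.load will reject the file. Redirecting with > instead overwrites the file each time.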

