Hello Centreon Community/Support,

I am experiencing an issue with a custom Lua stream connector script that is designed to send monitoring events to an external system (BEM). The script functions correctly on one Centreon Poller/Central server, but on a different host (which I'll refer to as "Problem Host"), events are consistently being dropped at the filter stage, resulting in the debug log message `filter' returned false.

I have confirmed that the target system is listening on the expected port (2540) and that the "Problem Host" can establish a TCP connection to it.

Problem Description:

On a specific Centreon host (the "Problem Host"), the Centreon Broker Lua stream connector is not processing events as expected. The debug logs show that the filter function in the Lua script is rejecting all incoming service status and host status events, even though the same script works perfectly on another Centreon host. This prevents any alarm messages from being sent to the BEM system.

Observed Debug Log Entries (from /var/log/centreon-broker/central-bmc-master.log on the Problem Host):

[2025-07-03T17:35:46.607+02:00] [lua] [debug] lua: `filter' returned false
[2025-07-03T17:35:46.607+02:00] [lua] [debug] lua: processing service status (59, 412)
[2025-07-03T17:35:46.607+02:00] [lua] [debug] lua: luabinding::write call
[2025-07-03T17:35:46.607+02:00] [lua] [debug] lua: processing service status (59, 412)
[2025-07-03T17:35:46.607+02:00] [lua] [debug] lua: `filter' returned false
... (repeated for various service and host status events) ...
[2025-07-03T17:35:46.608+02:00] [lua] [debug] lua: processing host status (42)
[2025-07-03T17:35:46.608+02:00] [lua] [debug] lua: luabinding::write call
[2025-07-03T17:35:46.608+02:00] [lua] [debug] lua: processing host status (42)
[2025-07-03T17:35:46.608+02:00] [lua] [debug] lua: `filter' returned false
...

Understanding the Logs:

  • [lua] [debug] lua: processing service status (59, 412): This indicates Centreon Broker is passing a service status event to the Lua script. 59 is likely the host ID and 412 the service ID.

  • [lua] [debug] lua: luabinding::write call: This is highly contradictory. If filter truly returned false, the write function should not be called. This suggests either a very confusing logging order or an unexpected internal behavior of Centreon Broker's Lua binding where write is called regardless of the filter's return in some specific scenarios, or that the filter log is from a different stage than the write call.

  • [lua] [debug] lua: `filter' returned false: This is the core of the problem. It states explicitly that the filter function in the Lua script is rejecting the event, which prevents any further processing in the script.
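To see how the three messages relate in a larger log, they can be counted per kind. The following is a minimal sketch in plain Lua; the function name tally_lua_log is hypothetical, and the match strings are simply copied from the excerpt above, so they may need adjusting for other Broker versions.

```lua
-- Count Broker Lua debug messages by kind.
-- Assumption: the message texts match the log excerpt quoted in this post.
local function tally_lua_log(lines)
  local counts = { processing = 0, write_call = 0, filter_false = 0 }
  for _, line in ipairs(lines) do
    -- Plain (non-pattern) searches, so no characters need escaping.
    if line:find("lua: processing ", 1, true) then
      counts.processing = counts.processing + 1
    elseif line:find("luabinding::write call", 1, true) then
      counts.write_call = counts.write_call + 1
    elseif line:find("filter' returned false", 1, true) then
      counts.filter_false = counts.filter_false + 1
    end
  end
  return counts
end

-- Example input: three lines taken from the excerpt above.
local sample = {
  "[2025-07-03T17:35:46.607+02:00] [lua] [debug] lua: processing service status (59, 412)",
  "[2025-07-03T17:35:46.607+02:00] [lua] [debug] lua: luabinding::write call",
  "[2025-07-03T17:35:46.607+02:00] [lua] [debug] lua: `filter' returned false",
}
local counts = tally_lua_log(sample)
print(counts.processing, counts.write_call, counts.filter_false)
```

If the number of write calls and the number of filter rejections are always equal, that would suggest the two messages belong to the same step for each event rather than to two contradictory code paths.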

Lua Script (/usr/share/centreon-broker/lua/bbdo2bem.lua):

-- Configuration variables
-- These values are overridden by parameters passed from the Broker JSON configuration
local tcp_server_address = "default.server.ip"
local tcp_server_port = 0
local max_buffer_size = 5000
local max_buffer_age = 5

-- Global variables (initialized by Centreon Broker)
local socket = require("socket")
local json = require("json")
local broker_log = broker_log -- Provided by Centreon Broker for logging

-- Event queue for buffering
local event_queue = {}
event_queue.events = {}
event_queue.last_flush_time = os.time()

-- Downtime tracking (populated by broker_downtime_event events)
local downtimes = {}

function event_queue:add(alarm)
  broker_log:debug(1, "lua: event_queue:add called for service_id: " .. tostring(alarm['service_id']) .. ", state: " .. tostring(alarm['state']))

  local alarmMessage = ""

  -- Handle host status events
  if alarm['element'] == 25 then -- broker_host_status
    if (alarm['state'] ~= 0 and alarm['last_check'] == alarm['last_hard_state_change']) then
      alarmMessage = string.format('{"event_type": "Host", "host_name": "%s", "state": "%s", "output": "%s", "last_check": %s, "last_hard_state_change": %s}',
        alarm['host_name'], alarm['state'], alarm['output'], alarm['last_check'], alarm['last_hard_state_change'])
    else
      broker_log:debug(1, "lua: Ignoring host event (not hard state change or state OK/UP): " .. tostring(alarm['host_name']) .. " - State: " .. tostring(alarm['state']) .. " - Last Check: " .. tostring(alarm['last_check']) .. " - Last Hard State Change: " .. tostring(alarm['last_hard_state_change']))
      return
    end
  -- Handle service status events
  elseif alarm['element'] == 24 then -- broker_service_status
    -- The original condition `if (alarm['state'] ~= 0 and alarm['state'] ~= 1 and alarm['state'] ~= 2)`
    -- would filter out OK, WARNING, and CRITICAL states. This seems counter-intuitive.
    -- Assuming the intent is to send non-OK states (WARNING, CRITICAL, UNKNOWN) if they are hard state changes.
    -- Re-evaluating the filtering for `alarm['state']` based on standard monitoring practice:
    -- 0: OK, 1: WARNING, 2: CRITICAL, 3: UNKNOWN

    -- Filter out OK (0) states, and only consider hard state changes for others.
    if (alarm['state'] ~= 0 and alarm['last_check'] == alarm['last_hard_state_change']) then
      alarmMessage = string.format('{"event_type": "Service", "host_name": "%s", "service_description": "%s", "state": "%s", "output": "%s", "last_check": %s, "last_hard_state_change": %s}',
        alarm['host_name'], alarm['service_description'], alarm['state'], alarm['output'], alarm['last_check'], alarm['last_hard_state_change'])
    else
      broker_log:debug(1, "lua: Ignoring service event (not hard state change or OK state): " .. tostring(alarm['host_name']) .. "/" .. tostring(alarm['service_description']) .. " - State: " .. tostring(alarm['state']) .. " - Last Check: " .. tostring(alarm['last_check']) .. " - Last Hard State Change: " .. tostring(alarm['last_hard_state_change']))
      return
    end
  else
    broker_log:debug(1, "lua: event_queue:add: Unknown element type or event ignored: " .. tostring(alarm['element']))
    return
  end

  if #alarmMessage > 0 then
    event_queue.events[#event_queue.events + 1] = alarmMessage .. "\n"
    broker_log:debug(1, "lua: Added event to buffer. Current buffer size: " .. #event_queue.events)
  end

  local current_time = os.time()
  if (#event_queue.events >= max_buffer_size) or (current_time - event_queue.last_flush_time >= max_buffer_age) then
    broker_log:debug(1, "lua: Flushing events. Buffer size: " .. #event_queue.events .. ", Age: " .. (current_time - event_queue.last_flush_time) .. " seconds.")
    self:flush()
  end
end

function event_queue:flush()
  if #self.events == 0 then
    broker_log:debug(1, "lua: Flush called but event buffer is empty.")
    self.last_flush_time = os.time()
    return
  end

  -- Note: the buffer is emptied before the send, so events are lost if the connection fails.
  local data_stack = table.concat(self.events)
  self.events = {}
  self.last_flush_time = os.time()

  local sock = socket.tcp()
  sock:settimeout(5)

  local status, err = sock:connect(tcp_server_address, tcp_server_port)
  if not status then
    broker_log:error(1, "lua: Failed to connect to BEM server (" .. tostring(tcp_server_address) .. ":" .. tostring(tcp_server_port) .. "): " .. tostring(err))
    sock:close() -- release the socket on failure as well
    return
  end

  local sent, send_err = sock:send(data_stack)
  if not sent then
    broker_log:error(1, "lua: Failed to send data to BEM server: " .. tostring(send_err))
  else
    broker_log:info(1, "lua: Successfully sent " .. tostring(sent) .. " bytes to BEM server.")
  end

  sock:close()
end

-- This function is called by Centreon Broker for every event.
-- It decides whether the event should be processed by the 'write' function.
function filter(category, element)
  broker_log:debug(1, "lua: `filter' called with category: " .. tostring(category) .. ", element: " .. tostring(element))

  -- category == 1 is broker_category_log (standard monitoring events, also known as NEB)
  -- As per Centreon documentation, 'neb' category refers to category 1.
  -- element 24 is broker_service_status
  -- element 25 is broker_host_status
  -- element 5 is broker_downtime (for tracking scheduled downtimes)
  if category == 1 and (element == 24 or element == 25 or element == 5) then
    broker_log:debug(1, "lua: `filter' returned true for category 1 (NEB) and element " .. tostring(element))
    return true
  end
  broker_log:debug(1, "lua: `filter' returned false for category " .. tostring(category) .. " and element " .. tostring(element))
  return false
end

-- This function is called by Centreon Broker for events that pass the filter.
function write(event)
  broker_log:debug(1, "lua: luabinding::write call (event element: " .. tostring(event['element']) .. ")")

  -- Handle downtime events separately
  if event['element'] == 5 then -- broker_downtime
    broker_log:debug(1, "lua: Processing downtime event for host_id: " .. tostring(event['host_id']) .. ", service_id: " .. tostring(event['service_id']) .. ", type: " .. tostring(event['downtime_type']) .. ", entry_type: " .. tostring(event['entry_type']))
    if event['entry_type'] == 2 then -- 2 means downtime started
      if event['service_id'] ~= 0 then
        downtimes[event['service_id']] = event
        broker_log:debug(1, "lua: Service " .. tostring(event['service_id']) .. " added to downtimes.")
      else
        -- Host downtime, needs a different tracking mechanism or apply to all services of host
        -- For simplicity, this example only tracks service downtimes directly
        broker_log:debug(1, "lua: Host " .. tostring(event['host_id']) .. " downtime started (not yet fully implemented for services of host).")
      end
    elseif event['entry_type'] == 3 then -- 3 means downtime ended (cancelled or expired)
      if event['service_id'] ~= 0 then
        downtimes[event['service_id']] = nil
        broker_log:debug(1, "lua: Service " .. tostring(event['service_id']) .. " removed from downtimes.")
      else
        -- Host downtime ended
        broker_log:debug(1, "lua: Host " .. tostring(event['host_id']) .. " downtime ended (not yet fully implemented for services of host).")
      end
    end
    return true
  end

  -- For service/host status events, check if in downtime
  local is_in_downtime = false
  if event['element'] == 24 then -- Service status
    local downtime_event = downtimes[event['service_id']]
    if downtime_event then
      -- Check if the event's last_check time falls within the downtime window
      if event['last_check'] >= downtime_event['start_time'] and event['last_check'] <= downtime_event['end_time'] then
        is_in_downtime = true
        broker_log:debug(1, "lua: Service " .. tostring(event['service_id']) .. " is in scheduled downtime. Ignoring event.")
      end
    end
    -- You might want to implement similar downtime logic for host status (element 25) here as well
    -- For now, host status events are not checked against scheduled downtimes in this specific logic.
  end

  if not is_in_downtime then
    event_queue:add(event)
  end

  return true -- Indicate that the event was processed by the write function
end

-- Initialization function (called once when Broker starts)
function init(conf)
  -- Overwrite default configuration variables with values from JSON config
  if conf and conf.tcp_server_address then
    tcp_server_address = conf.tcp_server_address
  end
  if conf and conf.tcp_server_port then
    tcp_server_port = tonumber(conf.tcp_server_port) -- Ensure it's a number
  end
  if conf and conf.max_buffer_size then
    max_buffer_size = tonumber(conf.max_buffer_size)
  end
  if conf and conf.max_buffer_age then
    max_buffer_age = tonumber(conf.max_buffer_age)
  end

  broker_log:set_parameters(1, "/var/log/centreon-broker/lua_bem_debug.log") -- Log to a separate file for easier debugging
  broker_log:info(1, "lua: Stream connector initialized with config: " .. json.encode(conf))
  broker_log:debug(1, "lua: TCP Server Address: " .. tostring(tcp_server_address) .. ", Port: " .. tostring(tcp_server_port))
end

JSON Configuration Files:

Here are the two relevant Centreon Broker JSON configuration files:

1. central-bmc-master1.json (This is the configuration for the "Problem Host" where the Lua script is failing):

{
"centreonBroker": {
"broker_id": 12,
"broker_name": "central-bmc-master1",
"poller_id": 1,
"poller_name": "Central",
"module_directory": "/usr/share/centreon/lib/centreon-broker",
"log_timestamp": true,
"log_thread_id": true,
"event_queue_max_size": 100000,
"command_file": "",
"cache_directory": "/var/lib/centreon-broker/",
"bbdo_version": "3.0.1",
"log": {
"directory": "/var/log/centreon-broker/",
"filename": "central-bmc-master.log",
"max_size": 0,
"loggers": {
"core": "info",
"config": "error",
"sql": "error",
"processing": "error",
"perfdata": "error",
"bbdo": "error",
"tcp": "debug",
"tls": "error",
"lua": "debug", // Set to debug for verbose Lua logging
"bam": "error",
"neb": "error",
"rrd": "error",
"grpc": "error",
"influxdb": "error",
"graphite": "error",
"victoria_metrics": "error",
"stats": "error"
}
},
"input": [
{
"type": "ipv4",
"name": "central-bmc-master-input",
"port": "5671",
"tls": "no",
"protocol": "bbdo",
"negotiation": "yes",
"one_peer_retention_mode": "no"
}
],
"output": [
{
"type": "lua",
"name": "central-bmc-master-output",
"path": "/usr/share/centreon-broker/lua/bbdo2bem.lua",
"filters": {
"category": [
"neb"
]
},
"lua_parameter": [
{
"name": "log_file",
"value": "/usr/share/centreon-broker/lua/debug.log\"",
"type": "string"
},
{
"name": "log_level",
"value": "debug",
"type": "string"
},
{
"name": "tcp_server_address",
"value": "XXXXXXXXXXX",
"type": "string"
},
{
"name": "tcp_server_port",
"value": "2540",
"type": "string"
}
]
}
],
"stats": [
{
"type": "stats",
"name": "central-bmc-master1-stats",
"json_fifo": "/var/lib/centreon-broker//central-bmc-master1-stats.json"
}
],
"grpc": {
"port": 51012
}
}
}

2. central-broker.json (This is the configuration for the main Central Broker, which sends data to central-bmc-master1.json):

{
"centreonBroker": {
"broker_id": 1,
"broker_name": "central-broker-master",
"poller_id": 1,
"poller_name": "Central",
"module_directory": "/usr/share/centreon/lib/centreon-broker",
"log_timestamp": true,
"log_thread_id": false,
"event_queue_max_size": 100000,
"command_file": "/var/lib/centreon-broker/command.sock",
"cache_directory": "/var/lib/centreon-broker",
"bbdo_version": "3.0.1",
"log": {
"directory": "/var/log/centreon-broker/",
"filename": "",
"max_size": 0,
"loggers": {
"core": "info",
"config": "error",
"sql": "error",
"processing": "error",
"perfdata": "error",
"bbdo": "error",
"tcp": "error",
"tls": "error",
"lua": "error", // Note: This is 'error' on the main broker, but 'debug' on central-bmc-master1
"bam": "error",
"neb": "error",
"rrd": "error",
"grpc": "error",
"influxdb": "error",
"graphite": "error",
"victoria_metrics": "error",
"stats": "error"
}
},
"input": [
{
"type": "ipv4",
"name": "central-broker-master-input",
"port": "5669",
"tls": "auto",
"protocol": "bbdo",
"negotiation": "yes",
"one_peer_retention_mode": "no"
}
],
"output": [
{
"type": "unified_sql",
"name": "central-broker-master-unified-sql",
"db_host": "localhost",
"db_user": "centreon",
"db_password": "Centreon123!",
"db_name": "centreon_storage",
"interval": "60",
"length": "15552000",
"db_port": "3306",
"check_replication": "no",
"store_in_data_bin": "yes",
"insert_in_index_data": "1",
"db_type": "mysql"
},
{
"type": "ipv4",
"name": "centreon-broker-master-rrd",
"port": "5670",
"host": "localhost",
"tls": "no",
"protocol": "bbdo",
"negotiation": "yes",
"one_peer_retention_mode": "no"
},
{
"type": "ipv4",
"name": "centreon-broker-master-bmc",
"port": "5671",
"host": "localhost",
"tls": "no",
"protocol": "bbdo",
"negotiation": "yes",
"one_peer_retention_mode": "no"
}
],
"stats": [
{
"type": "stats",
"name": "central-broker-master-stats",
"json_fifo": "/var/lib/centreon-broker/central-broker-master-stats.json"
}
],
"grpc": {
"port": 51001
}
}
}

How the Setup Works (or Should Work):

  1. central-broker.json (Main Broker): This instance (broker_id 1) is the primary broker on the Central server. It receives events from Centreon Engine, presumably through its BBDO input central-broker-master-input on port 5669.

    • It then outputs these events to:

      • unified_sql: For storing monitoring data in the centreon_storage database.

      • centreon-broker-master-rrd: Another BBDO output, likely to a dedicated RRD broker instance for graph generation (listening on port 5670).

      • centreon-broker-master-bmc: Crucially, this is an IPv4 BBDO output sending data to localhost:5671. This is the input for central-bmc-master1.json.

  2. central-bmc-master1.json (BEM Broker / Problem Host's Config): This instance (broker_id 12), despite being on the "Central" poller (poller_id 1), is acting as a dedicated broker for BEM integration.

    • Its central-bmc-master-input on port: 5671 is designed to receive the BBDO stream from central-broker.json.

    • Its central-bmc-master-output is the type: "lua" stream connector, which uses bbdo2bem.lua to send data to the BEM server (xxxxx:2540).

Network Connectivity & Listener Check:

On the "Problem Host" (the Centreon Central server where central-bmc-master1 is running), I have verified that:

  • The BEM server (infra-tcp2-zo.itn.ftgroup) is listening on port 2540.

    • ss -tnap | grep :2540 (run on the Problem Host; the output shows an established connection from cbd to port 2540 on the BEM server):

      [root@centreon centreon-broker]# ss -tnap | grep :2540
      ESTAB 0 0 myip:51352 bemIP:2540 users:(("sleep",pid=401022,fd=11),("cbd",pid=400780,fd=11))
      [root@centreon centreon-broker]#
  • The "Problem Host" can establish a TCP connection to the BEM server on port 2540.

    • telnet bem ip 2540 (from the Problem Host):

      Trying bem ip...
      Connected to bem ip.
      Escape character is '^]'.
  • Crucially, it is also confirmed that central-bmc-master1 is receiving data from central-broker-master on port 5671. This means the initial BBDO connection between the two Broker instances is working.

The Core Issue - filter returns false:

Despite the connection between brokers working, and the Lua script being correctly loaded, the filter(category, element) function in bbdo2bem.lua explicitly logs that it is returning false for service status (element 24) and host status (element 25) events.

My filter function's logic is:

function filter(category, element)
  broker_log:debug(1, "lua: `filter' called with category: " .. tostring(category) .. ", element: " .. tostring(element))
  if category == 1 and (element == 24 or element == 25 or element == 5) then
    broker_log:debug(1, "lua: `filter' returned true for category 1 (NEB) and element " .. tostring(element))
    return true
  end
  broker_log:debug(1, "lua: `filter' returned false for category " .. tostring(category) .. " and element " .. tostring(element))
  return false
end

Given the logs show category: 1 and element: 24 or 25, the filter function should be returning true. The fact that it's returning false is the root of the problem.
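To rule out the Lua logic itself, the filter can be exercised outside of Broker. The sketch below is a standalone harness under stated assumptions: broker_log is replaced by a small stub (the real object is injected by Centreon Broker), and the last three test pairs are arbitrary values chosen only for illustration.

```lua
-- Stub for the logger that Centreon Broker normally injects.
-- Assumption: only the debug method is needed for this harness.
broker_log = {
  debug = function(self, level, msg) print("[debug] " .. msg) end,
}

-- Same decision logic as the filter() in bbdo2bem.lua.
function filter(category, element)
  if category == 1 and (element == 24 or element == 25 or element == 5) then
    return true
  end
  return false
end

-- Pairs to try. The first three are the ones the script is meant to accept;
-- the remaining ones are illustrative values, not known Broker output.
local cases = {
  { 1, 24 }, { 1, 25 }, { 1, 5 },
  { 1, 99 },   -- another element id in category 1
  { 3, 24 },   -- another category
  { "1", 24 }, -- category passed as a string instead of a number
}
for _, c in ipairs(cases) do
  local result = filter(c[1], c[2])
  broker_log:debug(1, "filter(" .. tostring(c[1]) .. ", " .. tostring(c[2]) .. ") -> " .. tostring(result))
end
```

Run with a plain Lua interpreter, this accepts only the exact numeric pairs (1, 24), (1, 25) and (1, 5). Any other element id, any other category, or a value of a different type (the string "1" is not equal to the number 1 in Lua) is rejected, so the values actually passed in on the Problem Host are worth logging.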

Possible Causes / Questions for the Community:

  1. Broker Version & BBDO Protocol Discrepancy: Both central-broker.json and central-bmc-master1.json explicitly state "bbdo_version": "3.0.1".

    • Is it possible that the events being sent by central-broker-master (which is itself receiving events from Centreon Engine) are using an older BBDO event structure or a different Protobuf encoding than what the Lua interpreter in central-bmc-master1 expects, causing category and element to be misinterpreted within the filter function?

    • Centreon documentation mentions a migration from legacy BBDO events (v2) to Protobuf (v3). If the Centreon Engine or the initial Broker instance on the "Problem Host" is still generating legacy events, while the Lua binding expects Pb events, this could cause the filter logic to fail.

    • Question: What are the exact Centreon Platform and Centreon Broker versions on both the working host and the "Problem Host"? (e.g., rpm -qa | grep centreon for Centreon components and cbd --version for Broker). This is critical for assessing BBDO compatibility.

    • Question: Could the filters section in the central-bmc-master1.json output configuration (specifically "category": ["neb"]) be implicitly altering how the event data is presented to the Lua script, or clashing with the internal filter function's logic? Typically, if the filters array is used, events not matching those filters are dropped before the Lua filter function is even called. However, if an event passes, then the Lua filter should be the final arbiter.

  2. Lua Interpreter Environment: Could there be subtle differences in the Lua runtime environment (e.g., Lua 5.1 vs 5.3, or specific library versions) between the working Centreon host and the "Problem Host" that cause the filter function to misbehave?

    • Question: Are there any recommended checks for the Lua environment specific to Centreon Broker stream connectors?

  3. Contradictory Log Messages: The logging sequence `filter' returned false followed by luabinding::write call is highly confusing.

    • Question: Is this an expected debug logging behavior in some Centreon Broker versions when a filter rejects an event, but write is still internally called for some cleanup or fallback? Or does it definitively point to a bug or misconfiguration where the filter's return value is being ignored or misinterpreted by the Broker's core logic?

  4. Event Data Corruption/Malformation: While less likely if central-broker-master is successfully sending events, is it conceivable that the events arriving at central-bmc-master1 are somehow malformed or corrupted in a way that category and element are not correctly parsed by the Lua binding before being passed to filter?

    • Question: Is there a way to dump the raw event data as it enters the Lua stream connector for inspection on the "Problem Host"?
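One way to inspect what actually reaches the connector is a temporary debugging script that accepts every event and logs its full content. This is only a sketch under assumptions: dump_table is a hypothetical helper written in plain Lua, broker_log:info is used with level 0 so the message is not suppressed, and the script would replace bbdo2bem.lua only for a short test because it logs every event.

```lua
-- Serialize a (possibly nested) Lua table into a single readable line.
-- Keys are sorted so the output is stable between runs.
local function dump_table(t, depth)
  depth = depth or 0
  if type(t) ~= "table" then return tostring(t) end
  if depth > 3 then return "{...}" end -- guard against very deep nesting
  local keys = {}
  for k in pairs(t) do keys[#keys + 1] = k end
  table.sort(keys, function(a, b) return tostring(a) < tostring(b) end)
  local parts = {}
  for _, k in ipairs(keys) do
    parts[#parts + 1] = tostring(k) .. "=" .. dump_table(t[k], depth + 1)
  end
  return "{" .. table.concat(parts, ", ") .. "}"
end

function init(conf)
  -- Nothing to configure for this debugging variant.
end

-- Temporary debugging filter: accept everything so write() sees every event.
function filter(category, element)
  return true
end

-- Log the complete event, including its category and element fields.
function write(event)
  local line = "lua: raw event: " .. dump_table(event)
  if broker_log then
    broker_log:info(0, line) -- inside Centreon Broker
  else
    print(line)              -- when run with a plain Lua interpreter
  end
  return true
end
```

Comparing the category and element values in this output between the working host and the Problem Host should show whether the two hosts deliver the same event types to the script.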

Any insights or further debugging steps from the Centreon team or community would be immensely helpful. Thank you for your time and assistance.
