Skip to main content

Hi,

After upgrading from 24.04.2 to 24.04.4,
I have sql errors in the log file /var/log/centreon-broker/central-broker-master.log :

 

>sql] critical] sql: could not find index for service(x, y) - maybe the poller with that service should be restarted

 

I looked in the service table in the centreon database and the service table in the centreon_storage database, but I couldn't find a service with id x.

 

The problem is that the log files are growing and saturating the /var/log/ directory.

 

i did an analysis and optimization on the tables but the error is still there.

mysqlcheck --auto-repair --check --all-databases
mysqlcheck --optimize --all-databases

 

Same thing after rebuilding the RRD databases.

 

I only have a central server and no poller. Of course, I tried restarting the centreon services and the vm. 

 

Do you have any ideas on how to solve this problem?

HI @AntoninMHBSI can you check which poller is monitoring the host with ID = 20?

Then generate and export configuration of this poller.


The host with id 20 is OS-Windows-SNMP-custom…

MariaDB Bcentreon]> select * from host where host_id = 20 \G
*************************** 1. row ***************************
                          host_id: 20
       host_template_model_htm_id: NULL
               command_command_id: NULL
          command_command_id_arg1: NULL
                 timeperiod_tp_id: 1
                timeperiod_tp_id2: 1
              command_command_id2: NULL
          command_command_id_arg2: NULL
                        host_name: OS-Windows-SNMP-custom
                       host_alias: Template to check Windows server using SNMP protocol
                     host_address: NULL
                     display_name: NULL
          host_max_check_attempts: 5
              host_check_interval: 1
        host_retry_check_interval: 1
       host_active_checks_enabled: 2
      host_passive_checks_enabled: 2
              host_checks_enabled: 2
                    initial_state: NULL
            host_obsess_over_host: 2
             host_check_freshness: 2
         host_freshness_threshold: NULL
       host_event_handler_enabled: 2
          host_low_flap_threshold: NULL
         host_high_flap_threshold: NULL
      host_flap_detection_enabled: 2
           flap_detection_options: NULL
           host_process_perf_data: 2
   host_retain_status_information: 2
host_retain_nonstatus_information: 2
       host_notification_interval: 0
 host_recovery_notification_delay: NULL
        host_notification_options: d,u,r
       host_notifications_enabled: 1
     contact_additive_inheritance: 0
          cg_additive_inheritance: 0
    host_first_notification_delay: NULL
     host_acknowledgement_timeout: NULL
            host_stalking_options: NULL
              host_snmp_community: NULL
                host_snmp_version: NULL
                    host_location: NULL
                     host_comment: NULL
                       geo_coords: NULL
                      host_locked: 0
                    host_register: 0
                    host_activate: 1
1 row in set (0,000 sec)

I don't know if it's normal that the error is on a template. Am I looking in the right place?


Did you try to generate / export / restart your poller?


Yes, I tried exporting the central server configuration from the configuration/poller menu and restarting the poller by restarting the services, but to no avail. 
In your message you specify the host with id 20 but according to the error message it's not rather id 240 that I should check?

The id 240 corresponds to a "real" host.


I found a solution.

Looking in the centreon_storage database, I noticed that in the hosts table, I had 2 rows for the same host and that the id of one of its rows matched the id of the host in the error.

 

This command allowed me to find the name of the host that was in the error:
select * from hosts where host_id = 'x';

 

This checks the number of hosts attached to the host name:
select * from hosts where name = 'VMxxx01';

 

In the services table, I only had rows for the host id that wasn't in the error. I concluded that the id of this host is the id of the real object, while the other is the id of a ghost host.

 

select * from services where host_id = x;

 

In the resources table, I had services that were linked to the ghost host but with a disabled status.

 

select * from resources where parent_id = x;

 

I deleted the rows linked to the ghost host.

 

delete from resources where parent_id = x;


delete from hosts where host_id = 240;

 

After restarting the cbd service, the error disappeared.

Hope this helps.

 


Thanks @AntoninMHBSI for sharing your solution. Your help is much appreciated!


Reply