Skip to main content

Hello

 

I’m running centreon version 2.8.26 and 2 poller running version 18.10

I know my version is old but was working.

Recently cpu load increase every day from 20h to 11h.

Top indicate cpu usage came from mysqld

From mysql show full processlist I can see following sql queries during high CPU load

This queries seems to proceed in loop

Do you what feature or configuration can cause this kind of SQL query that look very complicated

 

| 407559 | centreon | localhost | centreon_storage | Query   |   16 | Sending data | SELECT
            h.host_id,
            h.name,
            h.address,
            s.checked as has_been_checked,
            s.description as service_description,
            s.display_name,
            s.display_name as alias,
            s.notes,
            s.check_command,
            s.perfdata,
            s.output,
            s.state as current_state,
            s.last_check,
            s.next_check,
            s.state_type,
            s.check_attempt as current_check_attempt,
            s.max_check_attempts,
            s.last_state_change,
            s.last_hard_state,
            s.last_hard_state_change,
            s.acknowledged as problem_has_been_acknowledged,
            d.start_time as downtime_start,
            d.end_time as downtime_end,
            d.author as downtime_author,
            d.comment_data as downtime_data
            FROM services s
            LEFT JOIN hosts h
                ON s.host_id=h.host_id
            LEFT JOIN downtimes d
                ON (d.host_id = h.host_id AND d.service_id=s.service_id AND d.start_time < UNIX_TIMESTAMP() AND d.end_time > UNIX_TIMESTAMP() AND d.deletion_time IS NULL)
            WHERE (d.downtime_id IS NULL OR d.downtime_id IN (
                            SELECT MAX(d.downtime_id) as downtime_id
                                FROM downtimes d where d.host_id = h.host_id AND d.service_id = s.service_id AND d.start_time < UNIX_TIMESTAMP() AND d.end_time > UNIX_TIMESTAMP() AND d.deletion_time IS NULL
                            )
                  )
               AND s.host_id = h.host_id AND s.enabled = 1 AND h.enabled = 1
               

 

I have checked centreon_storage database integrity and corrected some error but issue is still there

Bellow disk usage of each table

+--------------------+----------------------------------------------+------------+
| Database           | Table                                        | Size in MB |
+--------------------+----------------------------------------------+------------+
| centreon_storage   | data_bin                                     |   15262.04 |
| centreon_storage   | logs                                         |    1425.28 |
| centreon_storage   | comments                                     |     501.19 |
| centreon_storage   | downtimes                                    |     295.47 |
| centreon_storage   | servicestateevents                           |     251.47 |
| centreon_storage   | log_archive_service                          |     150.23 |
| centreon_storage   | log_action_modification                      |      86.14 |
| centreon_storage   | log_archive_host                             |      14.45 |
| centreon_storage   | hoststateevents                              |      11.06 |
| centreon           | ods_view_details                             |       6.03 |
 


Hi @gaarby 

 

Could you paste the result of this SQL command please: 

SHOW CREATE TABLE centreon_storage.downtimes;

It’s very strange that it happens every day at the same hour. 

Could you also share the output of: 

cat /etc/cron.d/cent*

Thanks


Hi Sims

 

Thanks for your reply

Bellow show create table and cron.d/cent*

I was thinking like you about a cron entry but didn’t find anyone match windows of high CPU

 

Have a nice evening

 

Grégory

 

MariaDB a(none)]> SHOW CREATE TABLE centreon_storage.downtimes;
+-----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table     | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
+-----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| downtimes | CREATE TABLE `downtimes` (
  `downtime_id` int(11) NOT NULL AUTO_INCREMENT,
  `entry_time` int(11) DEFAULT NULL,
  `host_id` int(11) NOT NULL,
  `service_id` int(11) DEFAULT NULL,
  `author` varchar(64) DEFAULT NULL,
  `cancelled` tinyint(1) DEFAULT NULL,
  `comment_data` mediumtext,
  `deletion_time` int(11) DEFAULT NULL,
  `duration` int(11) DEFAULT NULL,
  `end_time` int(11) DEFAULT NULL,
  `fixed` tinyint(1) DEFAULT NULL,
  `instance_id` int(11) DEFAULT NULL,
  `internal_id` int(11) DEFAULT NULL,
  `start_time` int(11) DEFAULT NULL,
  `actual_start_time` int(11) DEFAULT NULL,
  `actual_end_time` int(11) DEFAULT NULL,
  `started` tinyint(1) DEFAULT NULL,
  `triggered_by` int(11) DEFAULT NULL,
  `type` smallint(6) DEFAULT NULL,
  PRIMARY KEY (`downtime_id`),
  UNIQUE KEY `entry_time` (`entry_time`,`instance_id`,`internal_id`),
  KEY `host_id` (`host_id`),
  KEY `instance_id` (`instance_id`),
  KEY `entry_time_2` (`entry_time`),
  KEY `downtimeManager_hostList` (`host_id`,`start_time`),
  CONSTRAINT `downtimes_ibfk_1` FOREIGN KEY (`host_id`) REFERENCES `hosts` (`host_id`) ON DELETE CASCADE,
  CONSTRAINT `downtimes_ibfk_2` FOREIGN KEY (`instance_id`) REFERENCES `instances` (`instance_id`) ON DELETE SET NULL
) ENGINE=InnoDB AUTO_INCREMENT=1292745 DEFAULT CHARSET=utf8 |
+-----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

MariaDB -(none)]> 
 

 

-root@centreon ~]# cat /etc/cron.d/cent*
# ########################################
#
# Cron Configuration for Centreon
#
PATH=/sbin:/bin:/usr/sbin:/usr/bin

# rewrite file with new cron line
CRONTAB_EXEC_USER=""

############################
# Cron for CentACL
* * * * * apache /usr/bin/php -q /usr/share/centreon/cron/centAcl.php >> /var/log/centreon/centAcl.log 2>&1

############################
# Cron for Centreon-Downtime
*/5 * * * * apache /usr/bin/php -q /usr/share/centreon/cron/downtimeManager.php >> /var/log/centreon/downtimeManager.log 2>&1

############################
# Cron to build state events
0 3 * * * centreon /usr/share/centreon/cron/eventReportBuilder --config=/etc/centreon/conf.pm >> /var/log/centreon/eventReportBuilder.log 2>&1

# Cron to build reporting
0 6 * * * centreon /usr/share/centreon/cron/dashboardBuilder --config=/etc/centreon/conf.pm >> /var/log/centreon/dashboardBuilder.log 2>&1

# Cron for knowledge base
*/5 * * * * centreon /usr/bin/php -q /usr/share/centreon/cron/centKnowledgeSynchronizer.php >> /var/log/centreon/knowledgebase.log 2>&1

##########################
# Cron for Centreon-Backup
30 3 * * * root /usr/share/centreon/cron/centreon-backup.pl >> /var/log/centreon/centreon-backup.log 2>&1

# ########################################
#
# Cron Configuration for Centreon
#
PATH=/sbin:/bin:/usr/sbin:/usr/bin

# rewrite file with new cron line
CRONTAB_EXEC_USER=""

# Cron for tracing Nagios Poller Performances
*/5 * * * * centreon /usr/share/centreon/cron/nagiosPerfTrace --config=/etc/centreon/conf.pm >> /var/log/centreon/nagiosPerfTrace.log 2>&1

# Cron for databin and logs purge
0 2 * * * centreon /usr/bin/php /usr/share/centreon/cron/centstorage_purge.php >> /var/log/centreon/centreon-purge.log 2>&1

# Cron for database partitioning
0 4 * * * centreon /usr/bin/php /usr/share/centreon/cron/centreon-partitioning.php >> /var/log/centreon/centreon-partitioning.log 2>&1
 

 


 

 

Hello,


My bet would be that you trigger a lot of downtimes around 6 p.m.


Then, this cron execution takes a lot of time to retrieve the list of downtimes to apply at this time: 

  • /usr/share/centreon/cron/downtimeManager.php

Could you share the content of this log: 

  • /var/log/centreon/downtimeManager.log

 

How to move forward on this … First, please share the result of the following queries: 

SELECT count(*) FROM centreon_storage.downtimes

Then, the oldest entry in the table: 

SELECT min(entry_time) FROM downtimes WHERE entry_time IS NOT NULL;

 

Also, I see some difference on the unique key between my platform and yours, maybe an interesting test would be to align the schema with latest centreon version. Here are the steps you should take: 

  • Dump your table to backup it: 
mysqldump centreon_storage downtimes > downtimes.backup.sql
  • Modify the unique key:
ALTER TABLE downtimes     
DROP INDEX entry_time,
ADD UNIQUE KEY `entry_time` (`host_id`,`service_id`,`entry_time`,`instance_id`,`internal_id`);
  • Then try to run manually the query and check if it still generates high load on the server again:
SELECT
h.host_id,
h.name,
h.address,
s.checked as has_been_checked,
s.description as service_description,
s.display_name,
s.display_name as alias,
s.notes,
s.check_command,
s.perfdata,
s.output,
s.state as current_state,
s.last_check,
s.next_check,
s.state_type,
s.check_attempt as current_check_attempt,
s.max_check_attempts,
s.last_state_change,
s.last_hard_state,
s.last_hard_state_change,
s.acknowledged as problem_has_been_acknowledged,
d.start_time as downtime_start,
d.end_time as downtime_end,
d.author as downtime_author,
d.comment_data as downtime_data
FROM services s
LEFT JOIN hosts h
ON s.host_id=h.host_id
LEFT JOIN downtimes d
ON (d.host_id = h.host_id AND d.service_id=s.service_id AND d.start_time < UNIX_TIMESTAMP() AND d.end_time > UNIX_TIMESTAMP() AND d.deletion_time IS NULL)
WHERE (d.downtime_id IS NULL OR d.downtime_id IN (
SELECT MAX(d.downtime_id) as downtime_id
FROM downtimes d where d.host_id = h.host_id AND d.service_id = s.service_id AND d.start_time < UNIX_TIMESTAMP() AND d.end_time > UNIX_TIMESTAMP() AND d.deletion_time IS NULL
)
)
AND s.host_id = h.host_id AND s.enabled = 1 AND h.enabled = 1

Kill it after one minute if you see that it takes forever and generates high load on your server. 

 

Kind regards


Hi

 

Thanks again for your great responses and time allocated to my issue!

Appreciate a lot !

downtimeManager.log not modified until 2017… strange !

-rw-rw-r-- 1 centreon centreon 2435816 11 oct.   2017 /var/log/centreon/downtimeManager.log

 

You are right table downtime is big

 

MariaDB M(none)]> SELECT count(*) FROM centreon_storage.downtimes;
+----------+
| count(*) |
+----------+
|  1294073 |
+----------+
1 row in set (0.15 sec)

MariaDB >(none)]> SELECT min(entry_time) FROM downtimes WHERE entry_time IS NOT NULL;
ERROR 1046 (3D000): No database selected
MariaDB >(none)]> SELECT min(entry_time) FROM centreon_storage.downtimes WHERE entry_time IS NOT NULL;
+-----------------+
| min(entry_time) |
+-----------------+
|      1496410462 |
+-----------------+
1 row in set (0.22 sec)

 

After updating schema request is about 8 time faster

If I understand correctly downtime table is just an history of all downtime that where created ?

Can I simply empty this table to have only new one ?

What is usage of keeping all downtime ? first entry has 1496410462 as timestamp (02/06/2017)

 

Regards


bellow last log of /var/log/centreon/downtimeManager.log (2017)

PHP Notice:  Undefined index: 67 in /usr/share/centreon/www/class/centreonGMT.class.php on line 373
PHP Notice:  Undefined index: 67 in /usr/share/centreon/www/class/centreonGMT.class.php on line 373
PHP Notice:  Undefined index: 67 in /usr/share/centreon/www/class/centreonGMT.class.php on line 373
 

 

user apache associate to downtimeManager cron entry has nologin associated in /etc/passwd

apache❌48:48:Apache:/var/www:/sbin/nologin

 

I tried to manually run /usr/bin/php -q /usr/share/centreon/cron/downtimeManager.php >> /var/log/centreon/downtimeManager.log 2>&1 but nothing added to log file using (centreon user or root)


Glad to hear that the query is faster with the modified UNIQUE_KEY. 

 

Not to worry, if the file hasn’t been modified it means it ran without errors.

 

I suppose you can define a purge for this table here (menus might be slightly different with your version):

 

 

 

And configure the number of days you want to keep: 

 

 

A cron will apply this new parameter over the night and everything will work more efficiently starting from there. 

 


Thanks again for your quick reply

I have launched manually /usr/share/centreon/cron/centstorage_purge.php

Now I have queries is about 30 time faster

I will see if load increase this evening and let you know

Have a nice day

 

Gregory


Reply