Thursday, May 3, 2007

Cacti Performance Tuning

When using the Thold plug-in under Cacti, the three background pollers are:

1. poller.php
2. cactid
3. thold

Poller.php is the master process, run by cron. It kicks off cactid to do the actual data collection, then creates graphs and calls functions in the Thold plug-in so that the plug-in can update its data set.

II. Speed Bottlenecks

When using Cactid and the Thold plug-in, the dominant bottlenecks stem from how they use MySQL.

I. Cactid

Cactid is performance-limited to some extent by a lack of internal caching. This can be partially ameliorated by increasing the value of MAX_MYSQL_BUF_SIZE in cactid's util.h and recompiling. This value sets the size of a buffer of MySQL updates that are kept in RAM before being sent out as a block write. The larger the number, the more writes are cached internally and handed to MySQL as a single block, which is more efficient than making many individual calls.
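The buffering idea can be sketched in a few lines. This is only an illustration, not cactid's actual code: it is written in Python rather than C, it uses the standard library's sqlite3 module as a stand-in for MySQL, and the names MAX_BUF_SIZE, BufferedWriter, and the simplified poller_output table are assumptions made for the example.

```python
import sqlite3

MAX_BUF_SIZE = 4  # hypothetical stand-in for cactid's MAX_MYSQL_BUF_SIZE


class BufferedWriter:
    """Hold pending rows in RAM and write them to the database in blocks."""

    def __init__(self, conn, max_buf_size=MAX_BUF_SIZE):
        self.conn = conn
        self.max_buf_size = max_buf_size
        self.pending = []
        self.flushes = 0  # number of block writes issued so far

    def add(self, local_data_id, value):
        self.pending.append((local_data_id, value))
        if len(self.pending) >= self.max_buf_size:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        # One batched call and one commit replace len(self.pending)
        # separate statements and commits.
        self.conn.executemany(
            "INSERT INTO poller_output (local_data_id, output) VALUES (?, ?)",
            self.pending,
        )
        self.conn.commit()
        self.pending.clear()
        self.flushes += 1


def demo(n_rows=10, max_buf_size=MAX_BUF_SIZE):
    """Write n_rows results and report (rows stored, block writes issued)."""
    conn = sqlite3.connect(":memory:")
    # Simplified, hypothetical schema used only for this sketch.
    conn.execute("CREATE TABLE poller_output (local_data_id INTEGER, output TEXT)")
    writer = BufferedWriter(conn, max_buf_size)
    for i in range(n_rows):
        writer.add(i, str(i * 1.5))
    writer.flush()  # write whatever is still left in the buffer
    stored = conn.execute("SELECT COUNT(*) FROM poller_output").fetchone()[0]
    return stored, writer.flushes
```

With a buffer of 4, ten results go out in three block writes; with a buffer of 10 they go out in one. Against a real MySQL server the saving would come mostly from fewer network round trips, which this in-memory stand-in does not measure.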

More could be done to improve Cactid's MySQL performance, but it would require more in-memory caching and holding database connections open globally.

Also take note of the following:

The dynamics of threading (configurable through the Cacti Settings/Poller UI)

If you increase the number of threads in Cactid, it will run significantly faster, but the load on MySQL will be commensurately greater. You need to balance polling speed against the load on MySQL carefully so that the rest of the server stays usable.
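The trade-off can be made concrete with a toy model. The sketch below is not how cactid is implemented (cactid is written in C); it assumes each host poll costs one short database call, simulated with a sleep, and the names LoadMeter and poll_hosts are invented for the example. It reports the peak number of simulated database calls in flight, which is capped by the thread count.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class LoadMeter:
    """Track how many simulated database calls are in flight at once."""

    def __init__(self):
        self._lock = threading.Lock()
        self.in_flight = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
        return self

    def __exit__(self, *exc):
        with self._lock:
            self.in_flight -= 1
        return False


def poll_hosts(n_hosts, n_threads, query_seconds=0.01):
    """Poll n_hosts with n_threads workers; return (hosts polled, peak load, seconds)."""
    meter = LoadMeter()

    def poll(host_id):
        with meter:
            # Stands in for collecting data and writing it to the database.
            time.sleep(query_seconds)
        return host_id

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        polled = list(pool.map(poll, range(n_hosts)))
    elapsed = time.perf_counter() - start
    return len(polled), meter.peak, elapsed
```

Calling poll_hosts(20, 1) and then poll_hosts(20, 4) should show the elapsed time dropping while the peak load rises toward the thread count, which is the balance described above.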

II. Poller.php

Poller.php's dominant bottleneck is that it is written in PHP. If you watch the server while poller.php is working through cactid's result set, PHP becomes the top CPU consumer.

III. Thold Plug-in

The easiest way to make the Thold plug-in faster is to allow a greater number of concurrent poller processes. The poller calls the plug-ins and blocks until they return. The more concurrent pollers you have, the more concurrent plug-in runs you have; it is a one-to-one correspondence.

Thold is very inefficient internally: it performs no internal caching and works through its datastore one host at a time, one threshold at a time, which requires a large number of discrete MySQL calls. In contrast, poller.php pulls down a full set of data sources in a single MySQL call, caches them in an internal array, and then performs its operations from that array.
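The difference between the two access patterns can be sketched as follows. This is an illustration only: it is Python rather than PHP, it uses sqlite3 in place of MySQL, and the thresholds table, its columns, and the function names are hypothetical rather than taken from the Thold or Cacti source.

```python
import sqlite3


def make_db():
    """Build a small in-memory table: 3 hosts with 4 thresholds each."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE thresholds (host_id INTEGER, name TEXT, hi REAL)")
    conn.executemany(
        "INSERT INTO thresholds VALUES (?, ?, ?)",
        [(h, f"ds{t}", 100.0 * (t + 1)) for h in range(3) for t in range(4)],
    )
    return conn


def load_per_host(conn, host_ids):
    """One query per host: the number of database calls grows with the host count."""
    queries = 0
    result = {}
    for host_id in host_ids:
        rows = conn.execute(
            "SELECT name, hi FROM thresholds WHERE host_id = ?", (host_id,)
        ).fetchall()
        queries += 1
        result[host_id] = dict(rows)
    return result, queries


def load_bulk(conn):
    """One query for everything, cached in a dict keyed by host."""
    rows = conn.execute("SELECT host_id, name, hi FROM thresholds").fetchall()
    cache = {}
    for host_id, name, hi in rows:
        cache.setdefault(host_id, {})[name] = hi
    return cache, 1
```

Both functions return the same data, but the first issues one call per host while the second issues a single call regardless of how many hosts there are. A version that also queried once per threshold, as described above, would issue one call per host per threshold.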

In the longer term, making Thold more scalable would require greater internal caching within the plug-in's code. It is not performing a complex task, but it is doing that task quite slowly.

1 comment:

Patrick Best said...

I'd actually recommend against using the integrated cmd.php when tuning cacti. Use the Spine poller instead.
I've written an article on tuning with the spine poller here:

http://realworldnumbers.com/cacti-tuning-how-to-set-maximum-oids-per-get-request/