Configuring Collectd for beginners

Collectd is a lightweight and efficient service for collecting statistics about a system. Thanks to its large number of plugins and the ability to write your own, you can gather information about almost everything that happens in the system: from CPU, memory, and free disk space to the number of active requests in Nginx, active connections to MySQL, and similar parameters.

The program itself only collects data; to store it, you use output plugins. Output is supported to StatsD, InfluxDB, ElasticSearch, and MySQL, and you can also send the data to another Collectd instance or simply save it to a CSV file. In this article, we'll use InfluxDB to store the data and Grafana to display it conveniently.

Installing CollectD

To install CollectD on CentOS 7, you must first enable the EPEL repository. To do this, run:

yum install epel-release

Then install the program itself:

yum install collectd

To install CollectD on Ubuntu use the following command:

sudo apt install collectd

Now you can move on to the configuration.

Configuring Collectd

Let's start with the configuration file. It is located at /etc/collectd.conf (on Ubuntu, /etc/collectd/collectd.conf). At the top of the file are the parameters that define the general settings of the program. The main ones are:

  • Hostname – the host name stored in every metric record sent by this service;
  • FQDNLookup – whether to use a DNS lookup to determine the fully qualified host name;
  • BaseDir – the program's main directory, where it keeps its files;
  • PIDFile – the path to the program's PID file;
  • PluginDir – the directory containing the program's plugins;
  • TypesDB – the file describing the data types used by the standard Collectd plugins;
  • AutoLoadPlugin – automatic plugin loading; set this parameter to true so that all configured plugins are loaded automatically;
  • Interval – the data collection interval in seconds; collecting data every 30 seconds or every minute is usually sufficient;
  • MaxReadInterval – the upper limit on the interval between read attempts, which grows after each failed read of a plugin;
  • CollectInternalStats – collect Collectd's own internal statistics; disabled by default;
  • Timeout – the number of collection intervals without an update after which a value is considered missing; the default is 2;
  • ReadThreads – the number of threads used to read data from plugins;
  • WriteThreads – the number of threads used to write data.

For example, my configuration looks like this:

Let's get the program working first, and then we'll add the required plugins.

1. Configuring data output

As I said, the program has plugins for reading data and plugins for writing it. To save the collected data, you need an output plugin. Here is a list of the supported ones:

  • network
  • unixsock
  • graphite
  • http
  • kafka
  • log
  • mongodb
  • redis
  • riemann
  • sensu
  • tsdb

In this article I want to use InfluxDB. It works with the network plugin: Collectd connects to InfluxDB and sends it the data. To do this, add this configuration:

<Plugin "network">
    Server "localhost"
</Plugin>

Here we specify the server on which InfluxDB is running; you can also specify the port if it differs from the default. Then move on to configuring InfluxDB. Open the configuration file /etc/influxdb/influxdb.conf, find the following lines, and uncomment them:

vi /etc/influxdb/influxdb.conf

[[collectd]]
enabled = true
bind-address = ":25826"
database = "DatabaseName"
retention-policy = ""
batch-size = 5000
batch-pending = 10
batch-timeout = "10s"
read-buffer = 0
typesdb = "/usr/share/collectd/types.db"

You do not need to change the bind-address value. If you do change the port, the same port must be specified in the Collectd configuration. In the database parameter, specify the database name; in typesdb, write the same value as TypesDB in collectd.conf. Then restart InfluxDB:

sudo systemctl restart influxdb

2. Plugin CPU

As an example, let's collect CPU usage data with the cpu plugin. To do this, add the following lines to the configuration file:

<Plugin "cpu">
    ReportByState true
    ReportByCpu true
    ValuesPercentage true
</Plugin>

These parameters mean the following:

  • ReportByState – report statistics separately for the system, user, idle, and other states;
  • ReportByCpu – report statistics for each CPU core separately;
  • ValuesPercentage – report values as percentages.
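To make ValuesPercentage more concrete, here is a minimal Python sketch of how per-state percentages can be derived from two snapshots of cumulative CPU tick counters, such as those found in /proc/stat. The function name and the sample numbers are made up for illustration; this is not Collectd's own code.

```python
def cpu_percentages(prev, curr):
    """Return the share of each CPU state between two snapshots, in percent.

    prev and curr map a state name (user, system, idle, ...) to a
    cumulative tick counter, as found in /proc/stat.
    """
    deltas = {state: curr[state] - prev[state] for state in curr}
    total = sum(deltas.values())
    if total == 0:
        # No ticks elapsed between the snapshots: report zero everywhere.
        return {state: 0.0 for state in deltas}
    return {state: 100.0 * delta / total for state, delta in deltas.items()}


# Hypothetical snapshots taken one collection interval apart.
before = {"user": 1000, "system": 500, "idle": 8500}
after = {"user": 1050, "system": 525, "idle": 8625}
print(cpu_percentages(before, after))
```

With ReportByCpu enabled, the same calculation would simply be repeated for every core.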

Next, restart Collectd to apply the changes:

sudo systemctl restart collectd

Then make sure the data is arriving. In the InfluxDB console (influx), select the database and query it:

USE DatabaseName

SELECT * FROM cpu_value LIMIT 5
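The same check can be scripted. The sketch below only builds the request URL for the InfluxDB 1.x HTTP /query endpoint; it assumes InfluxDB listens on its default HTTP port 8086 without authentication, and the function name is hypothetical.

```python
from urllib.parse import urlencode


def influx_query_url(database, query, host="localhost", port=8086):
    """Build a URL for the InfluxDB 1.x HTTP /query endpoint."""
    params = urlencode({"db": database, "q": query})
    return f"http://{host}:{port}/query?{params}"


url = influx_query_url("DatabaseName", "SELECT * FROM cpu_value LIMIT 5")
print(url)
```

Opening this URL, for example with urllib.request.urlopen(url), should return the rows as JSON if the database exists and data is being received.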

If data is arriving, you will see several rows of data. That means everything is working, and you can move on to configuring Grafana. I covered that in more detail in the article on installing and configuring Grafana. For example, the settings for displaying CPU utilization look like this:

The result is a panel like this:

You can view Collectd metrics with other services as well. Next, let's look at how to configure individual plugins.

2. Plugin Memory

With the memory plugin you can see how much RAM is free and how much is used for different purposes. To enable it, add:

<Plugin "memory">
    ValuesAbsolute true
    ValuesPercentage true
</Plugin>

Here the first parameter enables output as absolute values and the second as percentages.

3. Plugin df

Using the df plugin you can see how much free space is left on disk and how much space is already taken:

<Plugin "df">
    Device "/dev/xvda1"
    MountPoint "/"
    IgnoreSelected false
    ReportInodes false
    ReportReserved false
    ValuesPercentage true
</Plugin>

  • Device – the disk partition to monitor;
  • MountPoint – the mount point of the file system to monitor;
  • ValuesPercentage – report values as percentages.
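To see roughly what kind of numbers this plugin reports, the following Python sketch reads the usage of the root file system with the standard shutil.disk_usage call and converts it to a percentage. The helper name used_percent is an assumption made for this example, and the calculation ignores reserved blocks, so it may differ slightly from what Collectd reports.

```python
import shutil


def used_percent(total, used):
    """Return used space as a percentage of total space."""
    return 100.0 * used / total if total else 0.0


# disk_usage returns a named tuple with total, used, and free, in bytes.
usage = shutil.disk_usage("/")
print(f"/ is {used_percent(usage.total, usage.used):.1f}% full")
```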

4. Plugin load

The load plugin shows the current system load based on the file /proc/loadavg:

<Plugin "load">
    ReportRelative false
</Plugin>
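For reference, /proc/loadavg holds a single line whose first three fields are the load averages over 1, 5, and 15 minutes. A minimal Python sketch of parsing it is shown below; the sample line is invented, and the key names follow the shortterm, midterm, and longterm fields of the load type in types.db.

```python
def parse_loadavg(line):
    """Parse one line in /proc/loadavg format into the three load averages."""
    fields = line.split()
    one, five, fifteen = (float(value) for value in fields[:3])
    return {"shortterm": one, "midterm": five, "longterm": fifteen}


# Sample content; on Linux you could read it with open("/proc/loadavg").read().
sample = "0.20 0.18 0.12 1/80 11206"
print(parse_loadavg(sample))
```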

5. Plugin nginx

The nginx plugin lets you collect web server usage statistics, which it obtains through the stub_status module. It is best to create a separate Nginx virtual host for this that listens for connections only on a local IP address. For example:

server {
    listen 127.0.0.1:8084;
    server_name _;
    server_name_in_redirect off;
    location / {
        stub_status on;
        access_log off;
    }
}

Then add this host to the Collectd plugin configuration:

<Plugin "nginx">
    URL "http://localhost:8084"
</Plugin>

6. Plugin tail

This is the most interesting plugin: like the tail command in a terminal, it reads any file, such as a log file, and extracts metrics from it using regular expressions. In this example, we read the nginx log file and count how many responses returned 2xx, 3xx, 4xx, and 5xx status codes. The configuration looks like this:

<Plugin "tail">
    <File "/var/log/nginx/domains/losst.EN.log">
        Instance "nginx"
        <Match>
            Regex "HTTP/1.1\" 2[0-9]{2}"
            DSType "GaugeInc"
            Type "gauge"
            Instance "http_2xx"
        </Match>
        <Match>
            Regex "HTTP/1.1\" 3[0-9]{2}"
            DSType "GaugeInc"
            Type "gauge"
            Instance "http_3xx"
        </Match>
        <Match>
            Regex "HTTP/1.1\" 4[0-9]{2}"
            DSType "GaugeInc"
            Type "gauge"
            Instance "http_4xx"
        </Match>
        <Match>
            Regex "HTTP/1.1\" 5[0-9]{2}"
            DSType "GaugeInc"
            Type "gauge"
            Instance "http_5xx"
        </Match>
    </File>
</Plugin>

In the File parameter we specify the path to the log file to analyze. Instance is the name that identifies this instance of the plugin. Each Match block contains a regular expression and the configuration of the field in which the result is saved. Each Match has its own Instance, which is the name of the field the data is written to.
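To check what the regular expressions in the Match blocks actually catch, you can try them on a few log lines first. The Python sketch below applies the same four patterns and counts the matching lines per status class; the log lines are invented samples in the usual nginx access log format, and the function name is hypothetical.

```python
import re

# The same patterns as in the Match blocks, one per status class.
PATTERNS = {
    "http_2xx": re.compile(r'HTTP/1.1" 2[0-9]{2}'),
    "http_3xx": re.compile(r'HTTP/1.1" 3[0-9]{2}'),
    "http_4xx": re.compile(r'HTTP/1.1" 4[0-9]{2}'),
    "http_5xx": re.compile(r'HTTP/1.1" 5[0-9]{2}'),
}


def count_status_classes(lines):
    """Count how many lines match each status-class pattern."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Hypothetical access log lines.
sample_log = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 612',
    '203.0.113.5 - - [10/Oct/2023:13:55:37 +0000] "GET /old HTTP/1.1" 301 169',
    '203.0.113.9 - - [10/Oct/2023:13:55:38 +0000] "GET /missing HTTP/1.1" 404 153',
    '203.0.113.9 - - [10/Oct/2023:13:55:39 +0000] "GET / HTTP/1.1" 200 612',
]
print(count_status_classes(sample_log))
```

Note that these patterns only match HTTP/1.1 requests; lines logged for other protocol versions would not be counted.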

Type is the type of the field. It only matters later, when you filter the data in Grafana or another tool, for example to distinguish values stored as percentages from values stored as bytes. The most interesting parameter here is DSType, which determines how the data is processed and stored. Here are the available types:

  • Gauge – suitable for values that you want to store as is, without any changes;
  • Counter – suitable for values that are compared with the previous value: the program subtracts the previous value from the current one. The value can only increase;
  • Derive – works similarly, except that the value can both increase and decrease.

Each of these types has several variants, including Inc (a running count), Max (the maximum value), and Average (the average). In the example I used GaugeInc, since I want the count of matching lines stored as is.

To display Derive and Counter values properly in Grafana, add the derivative function to SELECT from the Transformations submenu; the graphs then become much easier to read. A straight line climbing steadily upward is hard to interpret.
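The idea behind the derivative is simple: instead of plotting the ever-growing counter itself, you plot how fast it changes. A minimal Python sketch, assuming a hypothetical counter sampled every 30 seconds:

```python
def rates(samples):
    """Turn (timestamp, counter_value) samples into per-second rates."""
    result = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        # Change in value divided by elapsed time between two samples.
        result.append((t1, (v1 - v0) / (t1 - t0)))
    return result


# Hypothetical counter values: (seconds, total requests so far).
samples = [(0, 100), (30, 160), (60, 250), (90, 250)]
print(rates(samples))
```

The raw values only ever climb, while the rates show that traffic rose and then stopped, which is the information you actually want on a graph.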

7. Plugin MySQL

With the mysql plugin you can track MySQL usage statistics, such as total traffic to the database, the number of SELECT, INSERT, and UPDATE queries, and other parameters. Its configuration looks like this:

<Plugin "mysql">
    <Database "DatabaseName">
        Host "host"
        User "user"
        Password "password"
        Port "port"
        MasterStats true
    </Database>
</Plugin>

For this plugin to work, you must fill in the database access details: the host (Host), the user (User), that user's password (Password), and the port on which the database service waits for connections (Port). You also need to specify the name of the database for which statistics should be collected.

8. Sending data over the network

As I said, you can not only store data with various storage plugins but also forward it to another Collectd instance for saving. This is done with the network plugin. To enable the server, use this configuration:

<Plugin "network">
    <Listen "ip_address">
        SecurityLevel None
    </Listen>
</Plugin>

Here you need to specify the IP address on which the server will listen for connections. Since authorization is disabled, it is better to configure iptables so that only the desired IP addresses can connect to the server. In the client configuration, it is enough to add a connection to the server:

<Plugin "network">
    Server "ip_address"
</Plugin>

That's all. Now the data will be transmitted to the desired server. After making any configuration changes, you must restart Collectd:

sudo systemctl restart collectd


In this article we looked at how to install and configure Collectd for beginners. Although it is a very lightweight service, it offers a lot of functionality and lets you monitor your system thoroughly.

