Collectd

From HerzbubeWiki

collectd

Debian packages

collectd

Note: Some plugins require libraries and/or daemons provided by separate packages. These packages are installed through a "Recommends" dependency of the main collectd package. This is explained in /usr/share/doc/collectd/README.Debian.plugins.


References

collectd Website 
http://collectd.org/
configuration 
man collectd.conf
plugins 
some plugins have a small separate man page; try man collectd-<plugin>
collectd wiki 
Also has useful information; notably, the table of plugins lists some plugins that are not documented in any man page. http://collectd.org/wiki/index.php/Table_of_Plugins


Configuration

Overview

The configuration is stored in

/etc/collectd/collectd.conf


Global options

  • the default settings seem to be OK
  • default interval to query plugins = 10 seconds
  • default data directory = /var/lib/collectd
  • default directory to load plugins from = /usr/lib/collectd


Active plugins

Currently the following plugins are in use:

apache 
Apache web server statistics
cpu 
CPU load; Note: this is not documented on any man page
cpufreq 
CPU frequency
df 
Filesystem usage
entropy 
Entropy available on the system
interface 
Traffic on network interfaces
irq 
Triggered IRQs
load 
Load averages (same as printed by the w command line utility)
memory 
Memory usage
ntpd 
NTP statistics (what statistics, exactly?)
processes 
Statistics for selected processes
rrdtool 
Stores statistics in RRD files; Note: requires separate package rrdtool
swap 
Swap space usage
syslog 
Lets collectd send its output to syslog
users 
Number of users


Interesting plugins

The following plugins look interesting, but I have not yet gotten around to finding out more about them. Here are some general notes:

  • bind
  • email
    • see man page collectd-email
    • opens a socket that can be used for inserting email statistics into the collectd system
    • can be used with the SpamAssassin plugin Mail::SpamAssassin::Plugin::Collectd, and others
  • iptables
    • see man page collectd.conf
  • mbmon
    • Hardware monitoring without kernel support (seems to have some sort of direct access to the motherboard sensors)
    • requires separate package mbmon
  • mysql : MySQL database server statistics
  • netlink
    • network statistics acquired directly from the kernel
    • provides far more details than the interface plugin
  • postgresql
  • sensors
    • collects statistics provided by the package lm_sensors
  • tail
    • can be used to collect statistics from log files


Unused plugins

The purpose of the following plugins is unknown to me, or the plugin is of no interest to me. I list them for completeness' sake because I found a reference to them in collectd.conf:

apcups 
apcupsd statistics (a daemon for controlling APC UPSes)
ascent 
Statistics about an Ascent server (a free server for the "World of Warcraft" game)
battery 
Battery statistics
conntrack 
 ?
contextswitch 
 ?
csv 
Writes collected values to CSV files
curl 
 ?
curl_json 
 ?
curl_xml 
 ?
dbi 
 ?
disk 
Hard drive usage
dns 
DNS statistics, obtained by monitoring network traffic
exec 
forks off an executable either to receive values or to dispatch notifications to the outside world; see man collectd-exec
ipmi 
 ?
ipvs 
 ?
java 
 ?
libvirt 
Collects CPU, disk and network load for virtualized guests on the machine; uses libvirt (http://libvirt.org)
madwifi 
 ?
memcachec 
 ?
memcached 
memcached statistics (cache utilization, memory and bandwidth used)
multimeter 
 ?
network 
 ? (see man collectd.conf)
nfs 
 ?
nginx 
nginx statistics (an HTTP and mail server/proxy)
notify-desktop 
 ?
notify-email 
 ?
nut 
collect data from UPSes
olsrd 
 ?
openvpn 
 ?
perl 
embeds a Perl interpreter into collectd so that plugins written in perl can be used; see man collectd-perl
pinba 
 ?
powerdns 
statistics for PowerDNS nameserver and/or PowerDNS recursor
protocols 
 ?
python 
 ?
rrdcached 
 ?
serial 
 ?
snmp 
SNMP statistics; see man collectd-snmp
table 
 ?
teamspeak2 
teamspeak2 server statistics
ted 
 ?
thermal 
 ?
tokyotyrant 
 ?
unixsock 
 ? (see man collectd.conf)
uptime 
 ?
uuid 
causes the hostname to be taken from the machine's UUID, which is usually taken from the machine's BIOS; this is useful if the machine is running in a virtual environment such as Xen
vserver 
 ?
wireless 
 ?
write_http 
 ?


Disabled plugins

Although I would have liked to use the following plugins, I had to disable them for various reasons.

hddtemp 
Hard disk temperature; Note: requires daemon installed by separate package hddtemp
  • produces segmentation fault
ping 
Connection to hosts
  • produces output like the following, and crashes the collectd daemon
collectd: liboping.c:168: ping_timeval_sub: Assertion `(res->tv_sec > 0) || ((res->tv_sec == 0) && (res->tv_usec > 0))' failed.
tcpconns 
TCP connections
  • produces segmentation fault
vmem 
Virtual memory usage
  • produces segmentation fault


Plugin configuration

Some plugins require special configuration:

  • apache
    • URL http://localhost/server-status?auto
  • df
    • Filter out special filesystems (e.g. /dev) by specifying a filesystem type filter
    • FSType "ext3"
  • dns
    • Interface "eth1"
    • use the interface that points to the Internet since I currently do not have my own DNS service
  • mysql
    • Host "localhost"
    • User collectd
    • Password secret
    • the database user collectd is generated automatically when the Debian package is installed
    • the database user only has very general privileges
    • the password is randomly generated
  • ping
    • Host "www.google.ch"
    • Host "alcarondas"
    • TTL 255
  • processes
    • Process "/usr/sbin/mysqld"
    • Process "/usr/sbin/apache2"
    • Process "/usr/sbin/slapd"
    • Process "/usr/sbin/smbd"
    • Process "/usr/sbin/nscd"
    • Process "/usr/sbin/named"
  • rrdtool
    • DataDir "/var/lib/collectd/rrd"
  • syslog
    • LogLevel warning
    • this is essential! the default log level is "info", which produces megatons of output and creates multi-megabyte sized logcheck digests
  • vmem
    • Verbose true
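
In collectd.conf these settings live in per-plugin blocks. The following Python sketch renders a few of the settings listed above into that block form. It assumes the LoadPlugin / <Plugin name> layout described in man collectd.conf; the helper name render_plugin is made up for illustration, and writing every non-boolean value as a quoted string is my assumption about what the parser accepts, so compare the result with the shipped collectd.conf before using it.

```python
def render_plugin(name, options):
    """Render one plugin as a LoadPlugin line plus a <Plugin> block.

    `options` is a list of (key, value) pairs because keys such as
    "Process" may legitimately appear more than once.
    """
    lines = [f"LoadPlugin {name}", f"<Plugin {name}>"]
    for key, value in options:
        if isinstance(value, bool):
            text = "true" if value else "false"  # booleans are written bare
        else:
            text = f'"{value}"'  # assumption: everything else is quoted
        lines.append(f"  {key} {text}")
    lines.append("</Plugin>")
    return "\n".join(lines)


# A subset of the settings from the list above.
settings = [
    ("df", [("FSType", "ext3")]),
    ("rrdtool", [("DataDir", "/var/lib/collectd/rrd")]),
    ("syslog", [("LogLevel", "warning")]),
    ("processes", [("Process", "/usr/sbin/mysqld"),
                   ("Process", "/usr/sbin/apache2")]),
]

rendered = "\n\n".join(render_plugin(name, opts) for name, opts in settings)
print(rendered)
```

A list of pairs is used instead of a dictionary so that repeated keys (Process, Host) survive in the order they were written.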


Access to collected data

Overview

collectd is able to write data to CSV (comma-separated values) and RRD (round robin database, see the RRDtool chapter further down) files. However, it does not create graphs from these files. This task is delegated to separate scripts and packages. The collectd package provides a number of such scripts/packages in

/usr/share/doc/collectd/examples

See the file /usr/share/doc/collectd/README.Debian.gz for details.

In my experience, though, these scripts do not work properly, or at least not without more work than I care to invest. For this reason, I am currently using a manually installed instance of collectd-web, a modern web interface that pretty much "just works".


collectd-web

At the time of writing there is no Debian package for collectd-web, so it needs to be manually installed:

cd /var/www
git clone https://github.com/httpdss/collectd-web.git


The following general snippet in /etc/apache2/conf-enabled/000-pelargir.conf makes collectd-web available on all virtual hosts:

# ============================================================
# Settings for collectd-web
# ============================================================
Alias /collectd-web/ /var/www/collectd-web/
<Directory /var/www/collectd-web/>
  Require all granted
  Options Indexes FollowSymLinks MultiViews
  AllowOverride all
</Directory>


And that's it - virtually zero-conf on the part of collectd-web.


collectd2html.pl

When executed, collectd2html.pl generates a directory with graphs and a static HTML file which embeds those graphs.

In my tests, collectd2html.pl only produced empty graphs.


collection.cgi

collection.cgi generates graphs on the fly. It requires the following perl Debian packages to be installed:

  • librrds-perl
  • liburi-perl
  • libhtml-parser-perl

Installation:

cp /usr/share/doc/collectd/examples/collection.cgi /usr/lib/cgi-bin
chmod +x /usr/lib/cgi-bin/collection.cgi

Access:


collection3

collection3 is the successor to collection.cgi. collection3 is an entire small package that consists of multiple files organised in several directories in a typical UNIX style (etc, bin, lib, ...).

The following perl Debian packages need to be installed:

  • librrds-perl
  • libconfig-general-perl
  • libhtml-parser-perl (provides HTML::Entities)
  • libregexp-common-perl


Installation:

cp -Rp /usr/share/doc/collectd/examples/collection3 /usr/lib/cgi-bin/collection3
mv /usr/lib/cgi-bin/collection3/etc /etc/collection3
ln -s /etc/collection3 /usr/lib/cgi-bin/collection3/etc


Although the example files provided by the collectd package were correctly set up when I first installed them, the following things should be checked again after each subsequent update:

  • only files in /usr/lib/cgi-bin/collection3/bin may have the executable bit set
  • only files in /usr/lib/cgi-bin/collection3/bin may be served by apache "as is"; all other directories should contain a .htaccess file that provides some restrictions
    • etc/.htaccess and lib/.htaccess should look like this
deny from all
    • share/.htaccess should look like this
Options -ExecCGI
SetHandler none

Note: For the .htaccess files to take effect, /etc/apache2/conf.d/pelargir.conf must say "AllowOverride All" for the directory /usr/lib/cgi-bin


Access:


hddtemp

Debian packages

hddtemp


References

hddtemp Website 
http://www.guzu.net/linux/hddtemp.php


Configuration

DebConf questions:

  • Do you want /usr/sbin/hddtemp to be installed SUID root? = no
  • Interval between two checks = 10
  • Do you want to start the hddtemp daemon on startup? = yes
  • Interface to listen on = 127.0.0.1
  • Port to listen on = 7634
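
With the daemon listening on 127.0.0.1:7634 as configured above, its output can be read over a plain TCP connection. The following Python sketch shows how that might look. It assumes the daemon replies with records of the form |device|model|temperature|unit| joined back to back and then closes the connection; the function names and the sample reply are made up for illustration and are not captured output.

```python
import socket


def read_hddtemp(host="127.0.0.1", port=7634):
    """Read the daemon's complete reply; it closes the connection when done."""
    chunks = []
    with socket.create_connection((host, port), timeout=5) as conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", errors="replace")


def parse_hddtemp(reply, sep="|"):
    """Split a reply into (device, model, temperature, unit) tuples.

    Non-numeric temperatures (for example a sleeping drive) become None.
    """
    drives = []
    for record in reply.strip().strip(sep).split(sep * 2):
        if not record:
            continue
        device, model, temp, unit = record.split(sep)
        temperature = int(temp) if temp.lstrip("-").isdigit() else None
        drives.append((device, model, temperature, unit))
    return drives


# Hypothetical reply for two drives, the second one without a reading.
sample = "|/dev/sda|EXAMPLE DISK A|35|C||/dev/sdb|EXAMPLE DISK B|NA|*|"
drives = parse_hddtemp(sample)
print(drives)
```

On a machine where the daemon is running, parse_hddtemp(read_hddtemp()) would combine the two steps.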


RRDtool

Debian packages

rrdtool


References

Homepage 
http://oss.oetiker.ch/rrdtool/
Tutorial 
man rrdtutorial


Glossary

RRD 
Round Robin Database. Is represented by a file
DS 
Data Source. An RRD may contain 1-n DS
DST 
Data Source Type (e.g. COUNTER)
RRA 
Round Robin Archive. An RRD may contain 1-n RRA
PDP 
Primary Data Point. An entry inside an RRA which has the original value obtained from a DS
CDP 
Consolidated Data Point. An entry inside an RRA which is calculated by processing some PDPs
CF 
Consolidating Function. Processes PDPs to calculate a CDP


Creating a database

rrdtool create test.rrd                \
            --start 920804400          \
            DS:speed:COUNTER:600:U:U   \
            RRA:AVERAGE:0.5:1:24       \
            RRA:AVERAGE:0.5:6:10
  • creates database in file test.rrd
  • starts the database at time_t 920804400 (= March 7, 1999)
    • alternative ways to specify dates (using the "at-style", see "man rrdfetch" for details)
      • keywords referring to a specific date/time: now, yesterday, today, tomorrow, midnight, noon, teatime
      • keywords referring to amounts of time: years, months, weeks, days, hours, minutes, seconds
      • March 8 1999
      • 23:59 31.12.1999
      • 19970703 12:45
      • noon yesterday -3hours (same as "9am-1day")
      • -5h45min (same as "-6h+15min")
  • database contains 1 data source (DS)
    • data source is named "speed" (max. 19 characters)
    • data source type (DST) is "counter"
      • possible DSTs: gauge, counter, derive, absolute, compute
      • DST determines remaining arguments of the data source entry
    • data source has a "heartbeat" of 600 seconds (10 minutes)
      • the heartbeat defines the ***maximum*** number of seconds that 2 samples may be apart before a value of *UNKNOWN* is assumed
      • if during a single heartbeat, multiple samples are fed into the database, an average rate is calculated from these samples
    • we don't expect values to be in a specific range
      • U = "unknown"
      • range is given as minimum:maximum
      • if a min/max value is specified and the actual value of the data source is outside of that range, the value is assumed to be *UNKNOWN*
  • database contains 2 round robin archives (RRA)
    • every RRA stores a number of values for ***each*** DS
    • both RRAs in the example are fed one value every 300 seconds
      • 300 seconds because the "--step" option is omitted; this option defaults to 300 seconds
      • the value fed into the RRA is called "primary data point" (PDP)
      • the PDP is the average of all the DS samples fed into the database since the time of the last PDP
    • both RRAs also employ the "consolidating function" (CF) AVERAGE to calculate and store a "consolidated data point" (CDP) alongside the PDP
      • possible CFs are: AVERAGE, MIN, MAX, LAST
      • CFs use PDPs (not DS samples) as their parameters
      • remaining arguments are the same for all CFs
    • for both RRAs, the allowed fraction of *UNKNOWN* values is 0.5 (i.e. 50%)
      • if a CF gets more PDPs with an *UNKNOWN* value than what is allowed, the resulting CDP will be *UNKNOWN*
    • the first AVERAGE function uses 1 PDP, the second 6 PDPs to calculate their respective CDP
    • the first RRA stores 24 values, the second RRA stores 10 values


Writing to a database

rrdtool update test.rrd 920804700:12345 920805000:12357 N:12363
[...]
  • updates database in file test.rrd
  • multiple (3) values are fed into the database
  • a value argument consists of one timestamp and 1-n values ("timestamp:value[:value...]")
    • the number of values must match the number of data sources
    • in the example, we see that the database contains only 1 data source
    • in the example, two values have a time_t timestamp and one value has the "now" timestamp
    • the timestamp may be specified using the at-style by using "@" instead of ":" as the delimiter
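
Because the data source type is COUNTER, rrdtool does not store the raw values but the rate of change per second between two samples. The Python sketch below reproduces that calculation for the samples above. It is a simplification: it ignores counter wrap-around and the alignment of samples to step boundaries, the function name counter_rates is made up, and the third timestamp (920805300) is an assumption standing in for "N".

```python
def counter_rates(samples):
    """Return (timestamp, rate per second) for each pair of consecutive samples."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        rates.append((t1, (v1 - v0) / (t1 - t0)))
    return rates


samples = [
    (920804700, 12345),
    (920805000, 12357),
    (920805300, 12363),  # assumed timestamp; the command above uses N (now)
]

rates = counter_rates(samples)
for timestamp, rate in rates:
    print(f"{timestamp}: {rate:.10e}")
# 920805000: 4.0000000000e-02
# 920805300: 2.0000000000e-02
```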


Reading from a database

rrdtool fetch test.rrd AVERAGE --start 920804400 --end 920809200

Output (as of RRDtool 1.2.0):
                          speed

 920804700: nan
 920805000: 4.0000000000e-02
 920805300: 2.0000000000e-02
 920805600: 0.0000000000e+00
 920805900: 0.0000000000e+00
 920806200: 3.3333333333e-02
 920806500: 3.3333333333e-02
 920806800: 3.3333333333e-02
 920807100: 2.0000000000e-02
 920807400: 2.0000000000e-02
 920807700: 2.0000000000e-02
 920808000: 1.3333333333e-02
 920808300: 1.6666666667e-02
 920808600: 6.6666666667e-03
 920808900: 3.3333333333e-03
 920809200: nan
  • prints values every 300 seconds
    • 300 seconds because the "--resolution" option is omitted, in which case rrdtool selects the RRA with the finest resolution
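    • because we omitted the "--step" option when the database was created, the step is 300 seconds (the default for "--step"); the first RRA (1 PDP per CDP) therefore has a resolution of 300 seconds, while the second RRA (6 PDPs per CDP) has a resolution of 6 x 300 = 1800 seconds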
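
For further processing, the text output of rrdtool fetch is easy to parse. The following Python sketch does this for output shaped like the listing above: a header line with the data source names, followed by "timestamp: value ..." lines. The function name parse_fetch is made up, and the sample text is simply the first few lines of the listing.

```python
import math


def parse_fetch(text):
    """Parse `rrdtool fetch` output into (names, [(timestamp, {name: value})])."""
    lines = [line for line in text.splitlines() if line.strip()]
    names = lines[0].split()  # header line: one column per data source
    rows = []
    for line in lines[1:]:
        stamp, _, rest = line.partition(":")
        values = [float(v) for v in rest.split()]  # float() understands "nan"
        rows.append((int(stamp), dict(zip(names, values))))
    return names, rows


sample = """\
                          speed

 920804700: nan
 920805000: 4.0000000000e-02
 920805300: 2.0000000000e-02
"""

names, rows = parse_fetch(sample)
# Keep only the rows where the value is known.
known = [(t, v["speed"]) for t, v in rows if not math.isnan(v["speed"])]
print(names)
print(known)
```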