collectl-logging

Language: en

Version: 57252 (mandriva - 22/10/07)

Section: 1 (User commands)

Overview

Collectl supports two basic data logging mechanisms. In the first, it logs the data as read from /proc to a file with the extension 'raw' or 'raw.gz', depending on whether the Perl module Compress::Zlib has been installed. If it has not, one can always install it later and collectl will use compression the next time it is started. One useful property of raw files is that one can play them back with additional switches/options, either to display the data or to generate plottable files from it.

The second major form of logging is writing data to one or more plottable files. Data associated with the 'core' subsystems goes to a file with the extension 'tab', and the detail data associated with devices such as CPUs, disks and networks goes to one of several other files.

For most users and in most situations, one or the other of these mechanisms is sufficient. However, there are situations in which additional logging mechanisms are either necessary or desirable, as described below.

More on Raw Files

The biggest benefit of raw files is that they are very lightweight to create, in that no additional processing is performed on them. Since they contain the original data from which collectl derives the numbers it reports, it is always possible to go back to the original numbers. In some cases the raw file contains data that was easier to collect than to ignore, and in these cases one can actually get at more data than collectl normally reports.

More on Plottable Files

As their name implies, plottable files hold their data in a form that is ready to be plotted with tools like gnuplot, or immediately loaded into a spreadsheet like OpenOffice or Excel or any other tool that can read space-separated data. When collectl generates these files while it is running, the data can easily be read as it is being written, making real-time monitoring/display possible.
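Reading space-separated data of this kind takes very little code. The following is a minimal sketch, not collectl's own tooling: the column names, the sample rows and the convention that a '#' line names the columns are all assumptions made for illustration, so check them against a real file before relying on them.

```python
def read_plottable(lines):
    """Parse space-separated rows into dicts keyed by column name."""
    header = None
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Assumption: the last '#' line before the data names the columns.
            header = line.lstrip("#").split()
            continue
        if header is None:
            continue  # no column names seen yet, so the row cannot be labeled
        rows.append(dict(zip(header, line.split())))
    return rows


# Hypothetical sample data, not taken from a real collectl file.
sample = """\
#Date Time User Sys
20071022 10:00:00 12 3
20071022 10:00:10 15 4
"""

rows = read_plottable(sample.splitlines())
print(rows[1]["User"])
```

Values are left as strings so that the caller can decide which columns are numeric.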

S-expressions

S-expressions have been around for many years, with their earliest roots in programming languages such as Lisp and Scheme, as described in the Wikipedia, and offer a semi-structured mechanism for the representation of data. One environment in which they are heavily used is supermon (see http://supermon.sourceforge.net/). By providing a mechanism for collectl to write s-expressions, one can more easily supply data to supermon or any other tool that might wish to consume it in [close to] real-time. The actual contents of the s-expressions are driven by the subsystems for which data is being collected.
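To give a feel for how a consumer might read such data, here is a minimal sketch of a generic s-expression reader. It is not collectl's or supermon's parser, and the element names in the example string are hypothetical; the layout collectl actually writes depends on the subsystems selected.

```python
def parse_sexpr(text):
    """Parse one s-expression into nested Python lists of string atoms."""
    # Pad parentheses with spaces so that split() yields them as tokens.
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise ValueError("missing closing parenthesis")
            pos += 1  # consume the ")"
            return items
        if token == ")":
            raise ValueError("unexpected closing parenthesis")
        return token

    result = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after expression")
    return result


# Hypothetical example; real element names will differ.
example = "(sample (time 1193047200) (cpu (user 12) (sys 3)))"
print(parse_sexpr(example))
```

This sketch handles only parentheses and whitespace-separated atoms; quoted strings containing spaces would need a real tokenizer.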

There are actually two types of values collectl can write into the s-expressions. The first is simply the raw data values as read from /proc, which one requests by specifying the 'raw' modifier with --sexpr. With this form, the consumer of the data must compute the differences between successive samples and, if a rate is desired, divide by the number of seconds between them. Raw values are required if one wishes to do any kind of historical analysis of the data.
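The calculation left to the consumer can be sketched as follows. The function name and the sample values are hypothetical, and the sketch assumes each sample carries a timestamp in seconds alongside the counter value.

```python
def counter_rates(samples):
    """Turn (timestamp_seconds, counter_value) pairs into per-second rates.

    Each rate covers the interval between two consecutive samples.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            continue  # skip duplicate or out-of-order samples
        rates.append((v1 - v0) / elapsed)
    return rates


# Hypothetical counter values sampled every 10 seconds.
samples = [(0, 1000), (10, 1500), (20, 1530)]
print(counter_rates(samples))
```

A real consumer would also need to decide how to handle counters that wrap or reset, which this sketch ignores.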

On the other hand, if one simply wishes to look at the current rates for the various counters, the second form, which is requested by specifying the 'rate' modifier with the --sexpr switch, will do just that. However, since these are only snapshots of the actual data, one should use only the data stored in collectl logs for any analysis that spans multiple samples.

The use of s-expressions requires that collectl be in logging mode, that is, that one has specified a destination with -f. The directory associated with this destination then becomes the default location for the s-expressions. If one wishes to change that directory, one can include the new destination with --sexpr.

One should also note that when run on an HP XC Cluster, the actual syntax of the s-expression generated has been extended to make it more easily consumable in that environment.

Logging to a raw file while also logging to a plottable one

The main benefit of having collectl write its data in plottable form is that the data becomes available for immediate plotting without any post-processing, the one expense being some additional processing overhead. However, there are a few potential limitations in doing so that should be understood.

First and foremost, once a plottable file has been created, the original data from which it was created is lost forever. In most cases that is fine, as there is really no need to go back. However, very often one collects summary data because that is what one is interested in, but later decides to look at the details. This is easily done by replaying the raw file and requesting that details be displayed or written to a plottable file. If a raw file was not generated, this is not possible.

A second limitation of plottable data files is that one cannot easily examine the data by timeframe, and when multiple data files are involved it is not easy to look at all the data together as time-oriented samples. It is always possible to write a script that merges this data, but that functionality is natively built into collectl when used in playback mode.

Finally, there are times when one might wish to go back and look at non-normalized data. For example, if 3 processes are created over a 10 second period, collectl will report a rate of 0 process creations/second. The only way to see what really happened is to play the data back with -on, which will tell you the value of the counter, not its rate/second.
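The arithmetic behind that example can be sketched in a few lines. This assumes the rate is reported as a whole number with the fractional part discarded, which is consistent with the 0 shown above but is not a statement about collectl's internals; the function name is hypothetical.

```python
def integer_rate(delta, seconds):
    """Per-second rate reported as a whole number (fraction discarded)."""
    return delta // seconds


created = 3    # processes created during the interval
interval = 10  # length of the interval in seconds

print(created / interval)               # the true rate, a fraction below 1
print(integer_rate(created, interval))  # what a whole-number report shows
print(created)                          # the non-normalized counter change
```

Any count smaller than the interval length collapses to 0 in this scheme, which is why the non-normalized value is the only way to see such low-frequency events.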

In most cases none of these restrictions should be a concern, but there may be rare occasions in which they are and that is where the --rawtoo switch comes in. When specified, collectl will generate raw data in addition to the plottable data, making it possible to go back to the source if/when necessary. The only real overhead is the amount of disk space required, but if the plottable files are being generated in uncompressed format, the size of the compressed raw file becomes less significant.

Remote logging when -A is used

This really applies only to applications that access collectl data over a socket, of which there are currently two: colmux and colgui. When collectl is used in this mode and one specifies -f and -P in the collectl command, collectl will log plottable data to the location requested. If one additionally specifies --rawtoo and/or --sexpr, collectl will generate the additional files requested. It is currently not possible to send collectl data over a socket and request that it log only raw data.

The overhead

So what is the overhead associated with all this logging? From the perspective of CPU load it can be quite minimal, since in many cases the data is already in hand and all that needs to be done is to write it out to one or more additional files, a fairly low-overhead operation on Linux systems. The one exception is requesting --sexpr when not already generating plottable data files, because collectl then has to look inside the raw data to extract the values to place into the s-expression, something that has already been done when plottable data is being generated.

The only other overhead component is disk space. For that, one can do some fairly simple tests to see what the resulting storage requirement would be by running collectl with an interval of 0 seconds and a count equal to the number of samples. For example, when run as a daemon, collectl takes 8640 10-second samples in a day. By creating various types of files with different combinations of logging switches, with and without compression, one can then determine relative overhead levels. By prefacing the collectl command with 'time', one can even measure the CPU load.
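The sample count for such a test follows directly from the sampling interval, and a short test run can be scaled up to a daily estimate. In this minimal sketch the function names and the file size are hypothetical, and the estimate assumes file size grows linearly with the number of samples.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(interval_seconds):
    """Number of samples taken in one day at the given interval."""
    return SECONDS_PER_DAY // interval_seconds


def estimate_daily_bytes(test_file_bytes, test_samples, interval_seconds):
    """Scale the size of a short test file up to one day of samples."""
    bytes_per_sample = test_file_bytes / test_samples
    return bytes_per_sample * samples_per_day(interval_seconds)


print(samples_per_day(10))

# Hypothetical test: 100 samples produced a 50,000-byte file.
print(estimate_daily_bytes(50_000, 100, 10))
```

Running the test with the full daily count, as described above, avoids the linear-growth assumption, which matters most for compressed files.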

AUTHOR

Copyright 2003-2007 Hewlett-Packard Development Company, LP. collectl may be copied only under the terms of either the Artistic License or the GNU General Public License, which may be found in the source kit.