Saturday, September 20, 2008

Reports

It's all very well to have a display of what is going on in your system at the moment - but what's been happening during the course of last month? Can you compare last year to today? How can I prove that whatever-change-was-just-made has (or has not) made a difference?


Sure, you can hack together an SQL routine or Perl script to get the raw data out of the monitoring system, but then what about showing your conclusions to someone who doesn't speak your type of jargon? You need something which creates reports which have been made for human eyes - not man/machine hybrids.

In other words, you need reports so you can translate your technical knowledge into business knowledge and in that way share your IT information with the decision making sections of your company/organization.

 

image 

TCR1

image

One of the nicer ideas in Tivoli at the moment is the gradual merging of all the various reporting routines in the myriad Tivoli products.

IBM has taken the standard BIRT reporting system and wrapped it up as Tivoli Common Reporting (TCR) - all the cooler newer versions of the Tivoli family have their reports in this new standard. The site I linked to has a list of all the TCR offerings. More and more of them are published on OPAL all the time. ITM6.2 reports have just had an update, for example.

Using BIRT means that the reporting engine is (a) free and (b) easily customizable - for those who know what they're doing with it. Alas, I'm not yet quite good enough with BIRT to create my own extra-special reports.

The next version of TADDM will use these reports and I'm curious as how to go about creating a "mashup" of a report which merges CMDB data with monitoring events - for example, how about a report which shows Number of Failures as a function of Number of Configuration Changes across the organization?

-- Robert

Friday, September 12, 2008

Flush the buffer

A recent conversation in the Developerworks ITM 6.x forum dealt with an unusual problem with Universal Agents.
A few rows were showing up cropped - only the first part of the line was making it's way from the UA to the TEMS/TEPS database.

The root of the problem turned out to be the UA's "unflushed buffer". What is a buffer and why should it be flushed?
A computer program, be it a simple "hello world" application or a complex missile control system, can often be summarized like this:

Get Input -> Do Something -> Write Output and repeat.

Now, things get complicated when we try to be more efficient. Say we've got a script which is checking a series of files/disks/servers/anything. If it wrote the result of each check immediately then the hard disk writing heads would constantly be starting and stopping - which is not efficient. What happens (behind the scenes) is that the operating system creates a Buffer of information which holds the lines temporarily. Once enough lines have been written into the buffer, the buffer is written (flushed) to the disk. This translates as much less starting and stopping of the disk heads. If you run the script yourself then you'll see everything displayed on the screen because the operating system will flush the buffer at the end of the script - at the latest.

Now, I don't know what the exact reason for the UA losing parts of lines, but I assume that the way the script UA works is reading the buffer that the script writes. If the UA reads the buffer BEFORE the operating system flushes the buffer then the UA will only get part of the information. This will not affect all UAs, but it might come and bite a few.

There is a full solution in the works, but a good workaround (for Perl) in the meantime is to add the following line to the beginning of the script:

BEGIN { $| = 1; }

This will make sure that the operating system flushes the buffer at the end of every output operation and that guarantees that the script and the UA are synchronized. If you're using other script languages, you'll have to find the equivalent function.

This also happened to me way-back-when when I was writing DCL scripts for the OpenVMS operating system. Just goes to show that what comes around, goes around! It also demonstrates that we Tivoli types need a good background in general computer science knowledge to help us solve fiddly little problems.

-- Robert