Sunday, November 9, 2008

New versions...

I guess it's just that time of the year again!

TBSM 4.2 came out recently, as did Impact 5.1.

Yesterday I saw both TADDM 7.1.2 and ITM 6.2.1 in Passport Advantage.

Here's a brief list of TADDM changes from the release notes:

TADDM 7.1.2 gives you the rich details of configuration items with automated, agentless discovery of the assets and their application dependencies, as well as a Discovery Library technology to help leverage data from other sources.

TADDM is a configuration management tool that helps IT operations personnel ensure and improve application availability in application environments. The operational staff gets a top-down view of applications so the staff can quickly understand the structure, status, configuration, and change history of their business-critical applications. This view immediately isolates issues in times of performance or availability problems and enables more effective planning for application change without disruption. An agent-free creation and maintenance of a Configuration Management Database (TADDM database) is delivered without requiring custom infrastructure modeling. TADDM also provides complete cross-tier dependency maps, topological views, change tracking, event propagation, and detailed reports and analytics.

The following list includes the functionality that was added for the TADDM 7.1.2:

  • BIRT Report Infrastructure
  • Limited IPV6 Capability
  • Console Installation Capability
  • Improved View Performance
  • Improved Details Performance
  • Improved ECMDB synchronization times
  • Improved API query performance
  • Improved post-discovery processing performance
  • Improved TBSM Integration
  • Drill Down Capability for Business Applications
  • Comparison Report From Domain Manager
  • Cross-Domain Comparison reports from ECMDB
  • Additional MQ Cluster Information for zOS
  • Reduced WAN traffic during anchor usage
  • Upgraded first failure data capture tools
  • Weblogic 9.x and 10.x sensor support
  • Simplified migration from previous releases
  • Windows 2008 Support
  • AIX 6.1 Support
  • Bug Fixes

Updated documentation is here.

ITM 6.2.1 doesn't seem to have updated documentation yet, but here's the list I have of its changes:

    • Adaptive Monitoring
    • TEPS Changes
      • Event Slot Customization - replaces my older post!
      • Managed node search bar
      • Zoom in charts
      • SSO with Java Webstart
    • SPB bundles for TCM/TPM
    • More CLI abilities
      • Replace wadminep function (getfile, put file, list dir, execfile)
      • CLI for historical data configuration collection
      • Remotely invoke pdcollect tool
      • Expand tacmd createsit (display item, consecutive samples, state)
    • Agent Builder
      • Support 100+ connections for remote monitoring
      • Browser for logfile/script monitors
      • Add CIM provider
    • Out of the box Agentless OS monitoring packages
    • Infrastructure improvements
      • 64-bit zLinux and AIX support
      • Support TDW on zOS
      • Support TEC events from z Hub
      • Manage agent fail-over
      • Asynchronous Deployment
      • Support 64 bit counters
      • Agent Manager Services - an extra agentlet that manages the regular agents
    • Improved TADDM integration

Of those, the Adaptive Monitoring and Event Slots seem extremely interesting and I really want to try them out!

-- Robert

Edit: forgot a few things in ITM...

Tuesday, October 21, 2008

Ancient Greek Technology

An exhibition on ancient Greek technology titled "Greece and Technology: a perspective through time" will soon be hosted at the premises of the "Teloglion Foundation of Art" in Thessaloniki, organized by the University of Thessaloniki (Faculty of Sciences).

 

Antikythera Mechanism
The exhibition will open on November 12 and will run through January 11, 2009. The purpose of the exhibition is to highlight the recent scientific findings which show that the ancient Greeks were technologically more advanced than was previously thought.
Within the framework of the exhibition, a series of lectures has been announced to take place within 2008.
Among the lectures delivered, one will focus on the famous Antikythera Mechanism, a device which could predict eclipses decades in advance and was also used to record the four-yearly cycle of the original Olympic Games.

 

Visit the co-organizing bodies: Thessaloniki Science Center and Technology Museum (in Greek)
and the Society for Ancient Greek Technology Studies; Ancient Greece OnLine

 

-- Robert

Wednesday, October 15, 2008

How shall I TADDM the network, if I don't know what it's like?

TADDM, which I've written about before, is IBM's detective. It sniffs out and finds everything that's connected to the network, interrogates each item to find out what's inside and maps out the relationships (who's talking to whom and who's ignoring whom).

Now, the first time I have to remove the dewy-eyed look from users is when I inform them that, yes, TADDM can discover their unique 3rd-party/in-house applications, BUT... we have to tell TADDM what these products look like so that it can "tag" them correctly. They want TADDM to be able to "guess" or even divine which of the thousands of processes and files scattered around the network are related to which business service they run. TADDM can do that, but it needs at least a little help!

The other thing they don't like is when I have to feed TADDM the TCP/IP address scopes and user/passwords of their servers. They say "Why do I need to enter all these things if TADDM is supposed to discover everything by itself?". I must then reassure them that they do not want TADDM to be able to hack into their secure systems without the right credentials!
OK, so TADDM requires the users/passwords for the servers (it can also get the passwords from the organization's LDAP repository) but why does it need me to enter the IP addresses?

For two good reasons: (1) So that you can give each IP list (scope in TADDM jargon) its own set of credentials, e.g. IPs 200.100.10.1 through 200.100.10.255 use a certain user/password and another scope uses a different one. (2) So that you can schedule discoveries by scope, and that way you're not doing the whole net each time.
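To make reason (1) concrete, here is a minimal Python sketch of the idea of tying a credential set to an IP range. It is only an illustration: the `SCOPES` table, the credential labels and the `credential_for` helper are all hypothetical and are not TADDM's own format or API.

```python
import ipaddress

# Hypothetical scopes: (first IP, last IP, credential label).
# This is an illustration only, not how TADDM stores scopes.
SCOPES = [
    ("200.100.10.1", "200.100.10.255", "prod-login"),
    ("200.100.20.1", "200.100.20.255", "test-login"),
]


def credential_for(ip, scopes=SCOPES):
    """Return the credential label of the first scope containing ip, else None."""
    addr = ipaddress.ip_address(ip)
    for first, last, label in scopes:
        if ipaddress.ip_address(first) <= addr <= ipaddress.ip_address(last):
            return label
    return None
```

An address that falls outside every scope gets `None`, which mirrors the point above: no scope, no credentials, no discovery.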

But... what happens when an organization doesn't know how to split up its IP range into separate environments? Say there are 200 servers, split into development, test, production and DRP environments all overlapping in the same IP scope. If I tried running discovery, I'd get tons of errors for using wrong user/passwords and I'd get a lot of unneeded devices when all I want is the production environment, unless I matched up IP to environment first.

Of course, the whole point of TADDM is that you don't necessarily know your network ahead of time. So what do you do?

The solution is a bit iterative, but you don't run into errors on the way and you don't need to feed TADDM any knowledge about the way the network is mapped ahead of time. What happens is that TADDM creates something called Logical Connections between devices when it detects that one uses the other. Logical Connections can be between application servers and databases, web servers and storage area network devices, switches and ... you get the idea. Now, once TADDM has completed a discovery, it maps out the Logical Connections between items which lie within its scope. Logical Connections which go beyond the scope are listed, but not displayed on the map.

To leverage this, what we do is get our nose under the tent - we discover a few servers we know are in the right environment, find out which servers they are connected to, add them to the scope and continue till we have discovered everything in the environment we are interested in.
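The loop described above can be sketched in a few lines of Python. This is a simplified model, not TADDM code: `expand_scope` and its arguments are hypothetical names, and the `connections` list stands in for what the discovery runs would report. In real life each round needs a fresh discovery, because connections are only known for servers that have already been discovered.

```python
def expand_scope(seeds, connections, max_rounds=10):
    """Grow a scope from seed IPs by following connections until nothing new appears.

    connections is an iterable of (from_ip, to_ip) pairs, standing in for
    the Logical Connections reported after each discovery run.
    """
    scope = set(seeds)
    for _ in range(max_rounds):
        new = set()
        for a, b in connections:
            # An endpoint outside the scope that talks to one inside it
            # becomes a candidate for the next round.
            if a in scope and b not in scope:
                new.add(b)
            elif b in scope and a not in scope:
                new.add(a)
        if not new:
            break  # nothing left to add, the environment is covered
        scope |= new
    return scope
```

Servers that never talk to anything in the growing scope are never pulled in, which is what keeps the other environments out.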

There are two ways of listing Logical Connections outside of the TADDM GUI:

  1. ./api.sh -u username -p password find LogicalConnection
    api.bat for Windows users.
  2. Connect to the database and run select distinct fromip_x, toip_x from logcconn

The first command will create an XML file with all the data; the second returns a regular SQL result set. I used the second command (slightly modified to ignore 127.0.0.1 and output the data nicely) in a script file which runs loadscope after the select, and that gives me an ever-expanding scope containing the items which are talking to each other.

Don't forget that loadscope looks for a file formatted like this:

IP_Address/range/subnet, Exceptions, Description 
IP_Address/range/subnet, Exceptions, Description 
IP_Address/range/subnet, Exceptions, Description
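As a rough sketch of the "slightly modified" step, the Python below turns the (from IP, to IP) pairs from the select into lines in the three-field layout shown above, dropping 127.0.0.1. The function name and the default description are hypothetical, and leaving the Exceptions field empty for single addresses is an assumption that should be checked against what loadscope accepts.

```python
def scope_lines(pairs, description="added from logical connections"):
    """Turn (from_ip, to_ip) pairs into scope-file lines, skipping loopback."""
    ips = set()
    for from_ip, to_ip in pairs:
        for ip in (from_ip, to_ip):
            if ip and ip != "127.0.0.1":
                ips.add(ip)
    # Assumption: a single address has nothing to exclude, so the
    # Exceptions field is left empty.
    return [f"{ip}, , {description}" for ip in sorted(ips)]
```

Writing the returned lines to a file, one per line, gives something that can be handed to loadscope on the next iteration.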

Thanks to Byron for the help :)

Hmph.... This post seems a little too long for what I wanted to say... I hope I was clear in the end.

-- Robert

Tuesday, October 14, 2008

Hannibal - The Webcomic

Many people's first exposure to the ancient world of Rome is through the books they read as children. I first encountered Julius Caesar in reading the legendary comic book series Asterix. Mixing the old world with new technologies means that web-comic interpretations of the classical world were just a matter of time:

From <http://newsok.com/hannibals-epic-campaign-comes-to-web-based-comic/article/3309191>:
=============================================================================
Hannibal's epic campaign comes to Web-based comic
WORD BALLOONS
BY MATTHEW PRICE
Published: October 10, 2008
Most have heard the story of the Carthaginian general Hannibal leading elephants across the Alps to face the Romans. Writer Brendan McGinley wants you to see it.
"There's already plenty of good prose about Hannibal, (but) no good visual medium for a story that crackles with so many unforgettable images, like elephants on the Alps or Mago Barca spilling dead Romans' rings on the Senate floor," he said. "Maybe Vin Diesel's long-stalled film will change that; Victor Mature's sure didn't."
McGinley and artist Mauro Vargas, along with colorist Andres Carranza, bring the Hannibal story to life — with some humorous asides — re-enacting the second Punic War on the Shadowline Web comics page, <http://www.shadowlinecomics.com/webcomics/#/hannibalgoestorome/>.

Vargas "really defines and expresses his characters; you need that where history meets comedy," McGinley said. McGinley said the trickiest part of creating "Hannibal Goes to Rome" is sorting which Carthaginian did what.
"There are so many Hannos, Hannibals, Hasdrubals and Giscos!" McGinley meshes historical accounts to create the tale, which he then passes on to Vargas to draw. "The historians and artist make it easy for me; all I have to do is throw a little observational humor into the mouths of the poor schlubs caught up in events," he said.
"Hannibal Goes to Rome" was first a candidate on DC Comics' Zuda site (www.zudacomics.com). Zuda is a site created to seek fresh talent.
After competing on Zuda, McGinley hooked up with Shadowline's Jim Valentino, who was looking to launch some Web comics.

Saturday, September 20, 2008

Reports

It's all very well to have a display of what is going on in your system at the moment - but what's been happening during the course of last month? Can you compare last year to today? How can I prove that whatever-change-was-just-made has (or has not) made a difference?


Sure, you can hack together an SQL routine or Perl script to get the raw data out of the monitoring system, but then what about showing your conclusions to someone who doesn't speak your type of jargon? You need something which creates reports which have been made for human eyes - not man/machine hybrids.

In other words, you need reports so you can translate your technical knowledge into business knowledge and in that way share your IT information with the decision making sections of your company/organization.

 


One of the nicer ideas in Tivoli at the moment is the gradual merging of all the various reporting routines in the myriad Tivoli products.

IBM has taken the standard BIRT reporting system and wrapped it up as Tivoli Common Reporting (TCR) - all the cooler newer versions of the Tivoli family have their reports in this new standard. The site I linked to has a list of all the TCR offerings. More and more of them are published on OPAL all the time. The ITM 6.2 reports have just had an update, for example.

Using BIRT means that the reporting engine is (a) free and (b) easily customizable - for those who know what they're doing with it. Alas, I'm not yet quite good enough with BIRT to create my own extra-special reports.

The next version of TADDM will use these reports and I'm curious as to how to go about creating a "mashup" report which merges CMDB data with monitoring events - for example, how about a report which shows Number of Failures as a function of Number of Configuration Changes across the organization?

-- Robert

Friday, September 12, 2008

Flush the buffer

A recent conversation in the developerWorks ITM 6.x forum dealt with an unusual problem with Universal Agents.
A few rows were showing up cropped - only the first part of the line was making its way from the UA to the TEMS/TEPS database.

The root of the problem turned out to be the UA's "unflushed buffer". What is a buffer and why should it be flushed?
A computer program, be it a simple "hello world" application or a complex missile control system, can often be summarized like this:

Get Input -> Do Something -> Write Output and repeat.

Now, things get complicated when we try to be more efficient. Say we've got a script which is checking a series of files/disks/servers/anything. If it wrote the result of each check immediately then the hard disk heads would constantly be starting and stopping - which is not efficient. What happens (behind the scenes) is that the operating system (or the language runtime) creates a buffer which holds the lines temporarily. Once enough lines have been written into the buffer, the buffer is written (flushed) to the disk. This translates into much less starting and stopping of the disk heads. If you run the script yourself then you'll see everything displayed on the screen, because the buffer is flushed at the end of the script - at the latest.

Now, I don't know the exact reason the UA loses parts of lines, but I assume that the script UA works by reading whatever the script has written so far. If the UA reads BEFORE the buffer has been flushed then the UA will only get part of the information. This will not affect all UAs, but it might come and bite a few.
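The effect is easy to reproduce outside the UA. The Python sketch below is only a model of the situation: an in-memory `BytesIO` stands in for whatever the reader is watching, and the function name is made up for the illustration. It shows the simplest version of the problem, where a small write stays in the buffer and the reader sees nothing at all until the flush.

```python
import io


def visible_before_and_after_flush(data, buffer_size=4096):
    """Write data through a buffer and report what the reader side sees
    before and after the flush."""
    destination = io.BytesIO()  # stands in for the pipe or file the reader watches
    writer = io.BufferedWriter(destination, buffer_size=buffer_size)
    writer.write(data)
    before = destination.getvalue()  # data is still sitting in the buffer
    writer.flush()
    after = destination.getvalue()   # now it has reached the destination
    return before, after
```

A reader that looks between the write and the flush gets less than the writer thinks it has sent, which is the same class of mismatch as the cropped rows.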

There is a full solution in the works, but a good workaround (for Perl) in the meantime is to add the following line to the beginning of the script:

BEGIN { $| = 1; }

This makes Perl flush the output buffer after every output operation on the currently selected handle (STDOUT by default), which keeps the script and the UA synchronized. If you're using other script languages, you'll have to find the equivalent function.
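For Python scripts, one equivalent is to flush explicitly after each line. The helper below is a small sketch (the name `emit` is arbitrary); the built-in `print(..., flush=True)` or running the interpreter with `python -u` achieves much the same thing.

```python
import sys


def emit(line, stream=None):
    """Write one complete line and flush it straight away."""
    stream = sys.stdout if stream is None else stream
    stream.write(line + "\n")
    stream.flush()  # push the line out now instead of waiting for the buffer to fill
```

Calling `emit("disk1 OK")` in place of a bare `print` means each result leaves the script as soon as it is produced.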

This also happened to me way-back-when, when I was writing DCL scripts for the OpenVMS operating system. Just goes to show that what goes around, comes around! It also demonstrates that we Tivoli types need a good background in general computer science to help us solve fiddly little problems.

-- Robert

Thursday, July 31, 2008

TADDM logs and logging tips

A good TADDM technote has been published. It lists the various logs and their meanings.
The logs are located in $COLLATION_HOME\log. I've bolded the more useful ones.

control.log - contains messages from the start script

cdm.log - web portal logs are located here

discover.log - contains messages from the Discover jini service

discover-admin.log - contains messages from the DiscoverAdmin jini service

error.log - contains serious errors for any of the TADDM/CDT services

events-core.log - contains messages from the events core jini service

l2.log - contains messages for the layer2

local-anchor.<hostname>.log - logs messages from J2EE sensors

login.log - user login audit trail

proxy.log - contains messages from the Proxy jini service

tomcat.log - contains messages from the startup sequence

topology.log - contains messages from the topology jini service

services/ApiServer.log - XML, Java and EJB interfaces to CMDB are processed here

services/ChangeManager.log - ChangeManager works with StateManager to process change events after discovery completes

services/ClientProxy.log - start here for GUI issues. The GUI talks to client proxy exclusively

services/DiscoverManager.log - contains Sensor and Template Matcher messages (see comment further on)

services/DiscoverObserver.log - moves completed work items from DiscoverManager to TopologyManager

services/MonitorStateManager.log - processes discovery and change events

services/ProcessFlowManager.log - event processing engine for Discovery

services/ReportsServer.log - Handles some reports tasks

services/TopologyManager.log - interface between all other components and the datastore.

services/ViewManager.log - ViewManager builds the CI navigation trees

If the setting com.collation.discover.engine.SplitSensorLog is set to TRUE in the /etc/collation.properties then each sensor will have its own log in the directory log/sensors/<runid>/sensorName-IP.log. This is VERY useful in debugging discovery errors. If the value is set to FALSE then DiscoverManager.log will contain all the messages running together. I always make sure SplitSensorLog is set to TRUE.

Here are some more log file settings which are defined in the /etc/collation.properties file

  • com.collation.log.level - Logging level. Default is INFO, I set it to DEBUG whenever I need to see why a discovery failed. At DEBUG level you'll see the precise taddmtool command which was run and the server's response.
  • com.collation.deploy.dynamic.logging.enabled - If true, you don't have to restart TADDM when you change logging settings. I always make sure this is true; the (slight) performance drop you (might) get due to the constant rechecking is worth it when you want to debug a discovery.
  • com.collation.log.filesize - Controls the maximum size of each log file. Default is 20 Megabytes. When this limit is reached, the file is renamed and a new file is started.
  • com.collation.log.filecount - Maximum number of log files created (older ones are deleted). Default is 3.
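A quick way to check a collation.properties file against the debugging settings mentioned above is sketched below in Python. The property names come from this post; the parser, the `RECOMMENDED` table and the function names are hypothetical, and the parser only handles plain key=value lines. Note that DEBUG is a level to switch on while chasing a failed discovery, not necessarily one to leave on.

```python
# Debug-time settings taken from the post; values are compared case-insensitively.
RECOMMENDED = {
    "com.collation.discover.engine.SplitSensorLog": "true",
    "com.collation.log.level": "DEBUG",
    "com.collation.deploy.dynamic.logging.enabled": "true",
}


def parse_properties(text):
    """Parse simple key=value lines, ignoring blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def settings_to_change(text, recommended=RECOMMENDED):
    """Return {key: (current, wanted)} for settings that differ from the table."""
    props = parse_properties(text)
    return {
        key: (props.get(key), wanted)
        for key, wanted in recommended.items()
        if (props.get(key) or "").lower() != wanted.lower()
    }
```

Feeding it the contents of the properties file returns only the entries that still need attention, with `None` as the current value for a setting that is missing altogether.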

-- Robert