Tuesday 17 April 2007

Back Again

Well, back to work and 5,000+ emails, which brings me to today's topic: monitoring. Why does my company's monitoring system send me, on average, 500 emails on a good day (it runs to the thousands on a bad day)? I fail to see the point, or the actual problem, when I'm inundated with messages.

In a completely unscientific survey, I estimate that each server incident generates 17 initial messages! What's the point of that? Consequently, none of them touch my Inbox and most go straight in the bin. If I got one notification of a problem and one when it was fixed, my 500 emails would become around 20, which would be useful and manageable. I assume the problem is that the monitoring team haven't set up the software correctly and can't be bothered to fix it, as we're about to change it all anyway.
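For what it's worth, the "one message when it breaks, one when it's fixed" idea is not hard to sketch. The Python below is only an illustration of that idea, not how our monitoring software (or its replacement) actually works: the `collapse_alerts` function, the `web01` server name and the `DOWN`/`UP` status labels are all made up for the example, and it assumes messages arrive in order and that a server is healthy until its first problem message.

```python
def collapse_alerts(events):
    """Keep only the messages that change a server's known state.

    `events` is an iterable of (server, status) pairs in arrival order.
    The status labels "DOWN" and "UP" are hypothetical, for illustration.
    """
    last_status = {}      # server -> status we last notified about
    notifications = []
    for server, status in events:
        # Assume a server is healthy ("UP") until its first DOWN arrives.
        if last_status.get(server, "UP") != status:
            notifications.append((server, status))
            last_status[server] = status
    return notifications


# One incident: 17 repeated problem messages, then one recovery message.
incident = [("web01", "DOWN")] * 17 + [("web01", "UP")]
print(collapse_alerts(incident))
```

Fed the 17-message incident above, this keeps just two notifications: the first problem message and the recovery.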

To get round this ridiculous system, most operations teams quietly run their own monitoring tools which provide what they want. From a management perspective, it's horribly fragmented, undocumented, etc. - all the usual IT crimes - but it keeps us going. We are at least getting involved with specifying how we want the new system set up, and have vetoed the "let's turn everything on and see how many messages get sent out" approach. Even so, we're still expecting three months of drowning in email alerts.

Will the new system settle down and work? I'll keep you posted.
