I think we’ve all had that moment at 3:00am over the weekend when we wish our monitoring and management systems would just shut up. The battery life on the UPS in rack 4 is at 40%, we GET it. But we’ve also had that moment when we tilt our head a bit at a monthly availability report showing service gaps and can’t recall ever seeing a critical alert that telepresence went down in the executive wing. In both cases we thought we were covered and that our monitoring systems had our back. As it turns out, that’s a more common situation than it should be, and following a few best practices can ensure you’re getting full use of your monitoring system as well as getting more sleep at night.
You’re not the only one
Recently, Gleanster Research completed a deep-dive analysis of its recent IT monitoring systems survey to identify common issues administrators reported with their monitoring systems. The report covers the challenges of spurious alerting, incomplete systems visibility and discovery, as well as escalation and issue automation. With the number of monitored applications nearly doubling in three years, nearly 90% of top-performing administrators reported comprehensive application monitoring as critical to their success, with the majority feeling near-constant pressure to deliver additional, and ever-higher-performing, services.
Key takeaways
The report concentrates its best-practice recommendations in five major areas:
- Comprehensive monitoring
- Ready-to-wear threshold alerting
- Useful insight based on context and coverage
- End to end visibility
- Enabling self-service
Infinite customization isn’t infinitely helpful– Just because a monitoring package can monitor everything doesn’t mean it’s great at monitoring everything. A modern application and server monitoring platform should provide easy customization to reach unusual systems or map the critical, oddball processes in your business. But a modern monitoring system also shouldn’t leave you tangled up in a toolbox of parts you must assemble yourself. Out of the box, it should be really good at finding and recording data with minimal input. It should know the difference between Oracle DBs, Exchange servers and Linux, and present the data in an easy-to-consume way.
Avoid Baselineapalooza– You’re potentially monitoring thousands, tens of thousands or even hundreds of thousands of elements in your environment. Manually setting baselines is impractical and will shorten your life. Your monitoring solution should be able to get a good rough sense of when to alert for help based on watching those systems for a few days. You should also be able to tweak any alert threshold easily.
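To make the idea of a learned baseline concrete, here is a minimal sketch in Python. It is not how any particular product does it. It assumes a simple "mean plus a few standard deviations" rule, and the function names, the metric name and the sample readings are all made up for illustration.

```python
import statistics

def suggest_threshold(samples, k=3.0):
    """Suggest an alert threshold: mean + k * population standard deviation.

    The mean-plus-k-sigma rule is an assumption for this sketch; real
    products may use percentiles, seasonality models or something else.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to estimate a baseline")
    return statistics.fmean(samples) + k * statistics.pstdev(samples)

def effective_threshold(metric, learned, overrides):
    """Use the admin's manual override if one exists, else the learned value."""
    return overrides.get(metric, learned[metric])

# Hypothetical CPU readings (percent) collected while watching a server.
cpu_samples = [40, 42, 38, 41, 39]
learned = {"cpu_percent": suggest_threshold(cpu_samples)}
overrides = {}  # e.g. {"cpu_percent": 60.0} once an admin tweaks the alert

limit = effective_threshold("cpu_percent", learned, overrides)
print(f"alert when cpu_percent exceeds {limit:.1f}")
```

The override dictionary is the "tweak it easily" half of the advice: the learned value is only a starting point, and a human can still have the last word per metric.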
Multiple dashboard blindness– The majority of organizations have at least two disconnected monitoring systems, each with its own dashboard. Non-integrated dashboards are not only inefficient because they create duplicate monitors, they may also let critical monitoring fall through the cracks. Chances are one of the systems you already have is more able to create integrated views for troubleshooting than ever before. Read your manual for fun and profit (and maybe lower renewal costs).
Visibility into All the Things– While it’s important to ensure your core router is up, it’s not enough to make general assumptions about applications and servers. First, your system likely provides a mechanism to link topology with alerts to automatically suppress messages. USE IT. It’s the best way to silence your phone and avoid alerts from 300 application monitors in a rack when the top-of-rack switch is offline for 3 minutes of scheduled maintenance. Survey participants also indicated that broad vendor coverage is important. You manage VMs, Linux, Windows, databases and packaged applications. Your monitoring system should as well. Free is great on an invoice with other gear, but no good if it’s not at least a little vendor agnostic.
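The topology-based suppression described above can be sketched in a few lines of Python. This is a rough illustration of the general idea, not any vendor’s implementation. The device names and the simple child-to-parent map are hypothetical.

```python
def has_down_ancestor(node, parent_of, down):
    """Return True if any upstream device of `node` is itself down."""
    seen = set()  # guards against accidental cycles in the topology map
    parent = parent_of.get(node)
    while parent is not None and parent not in seen:
        if parent in down:
            return True
        seen.add(parent)
        parent = parent_of.get(parent)
    return False

def alerts_to_send(down, parent_of):
    """Keep only root-cause alerts: down nodes with no down ancestor."""
    return sorted(n for n in down if not has_down_ancestor(n, parent_of, down))

# Hypothetical rack: 300 application monitors behind one top-of-rack switch.
parent_of = {f"app-{i}": "tor-switch-4" for i in range(300)}
parent_of["tor-switch-4"] = "core-router"

# The switch goes offline for maintenance, so everything behind it looks down.
down = {"tor-switch-4"} | {f"app-{i}" for i in range(300)}
print(alerts_to_send(down, parent_of))
```

With the switch down, the 300 dependent application alerts are suppressed and only the switch itself is reported. If a single application fails while the switch is healthy, that application’s alert still goes out.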
Admin, Heal Thyself– Self-service isn’t just for HR and payroll, it’s for IT customers too. The trouble is that many tools provide ultimate power without oversight and control. And ultimate delegated control corrupts ultimately. Modern monitoring systems allow senior administrators to safely delegate monitoring, alerting and issue resolution to other specialists on the team who are closer to the action. Self-service features are often overlooked, sometimes lost deep in the admin guide, yet they both save your team time and empower users. (And increase your popularity.)
Key takeaway
Perhaps in the future, precognitive systems will resolve issues within milliseconds based on faint echoes of wobbling performance data. But in the meantime, senior administrators may gain significant functionality and even peace by revisiting their monitoring systems. The best provide built-in, best-of-breed processes contributed by administrators like you. Even better, you may already be paying for features you didn’t even know you had. And if there’s anything IT loves, it’s something free and easy.