What's in Your Performance Monitoring Strategy?
When an organization is looking to improve application and service delivery, consolidate existing performance monitoring tools and responsibilities or justify the impact of a new technology deployment, there are a few key components that serve as fundamental building blocks for an effective performance monitoring strategy.
Breaking your strategy into these components makes it easier to articulate your performance monitoring requirements and reach consensus on them, especially in an environment where cloud, the Internet of Things and software-defined everything are gaining significant momentum.
The first component is collection. Any performance monitoring strategy starts with data collection. If you can’t monitor it, you can’t manage it. To prevent visibility gaps, your performance monitoring platform should be data agnostic, with high-frequency polling down to the second. Of course, granular data collection is only useful when you can retain that data for a sufficient timeframe, so be sure you can keep as-polled data, meaning the raw samples rather than rolled-up averages, for accurate capacity forecasts. Applications, systems and network devices produce massive volumes of machine data, with cloud and virtualization only adding to the issue. If your monitoring platform can’t scale with your data collection and reporting needs, you’ll end up with significant visibility gaps in your infrastructure performance.
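To see why as-polled data matters, consider a minimal sketch with made-up numbers: a link polled once per second that sits at 20% utilization for most of a minute and bursts to 95% for five seconds. The sample values and variable names here are purely illustrative and do not come from any particular monitoring product.

```python
from statistics import mean

# Hypothetical link utilization (%), polled once per second for one minute:
# 55 quiet seconds at 20% and a 5-second burst at 95%.
as_polled = [20.0] * 55 + [95.0] * 5

# What a one-minute rollup would store in place of the 60 raw samples.
one_minute_average = mean(as_polled)

# What the raw, as-polled data still shows.
peak = max(as_polled)

print(f"1-minute average: {one_minute_average:.2f}%")  # 26.25%
print(f"as-polled peak:   {peak:.2f}%")                # 95.00%
```

Once the raw samples are averaged away, the burst is indistinguishable from a quiet minute, which is exactly the kind of visibility gap that skews a capacity forecast.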
Building the baseline. Once you’ve collected the broadest set of performance data at the required granularity, it’s time to establish a baseline for every metric you monitor. It’s imperative to understand what “normal” conditions look like at any given moment, especially in dynamic virtualized environments. Baselines then become your basis for an effective alerting method.
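As a rough illustration of what a baseline can look like, the sketch below groups historical samples by hour of day and records the mean and standard deviation for each hour. Bucketing by hour of day is an assumption made for this example (real platforms may use hour of week, seasonality models or other methods), and the function name build_baseline and the sample data are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev


def build_baseline(samples):
    """Group (timestamp, value) samples by hour of day and return
    {hour: (mean, population standard deviation)} for each hour with data."""
    by_hour = defaultdict(list)
    for timestamp, value in samples:
        by_hour[timestamp.hour].append(value)
    return {
        hour: (mean(values), pstdev(values))
        for hour, values in by_hour.items()
    }


# Hypothetical bandwidth utilization (%) observed on several days.
samples = [
    (datetime(2024, 1, 1, 9, 0), 40.0),
    (datetime(2024, 1, 2, 9, 0), 50.0),
    (datetime(2024, 1, 3, 9, 0), 60.0),
    (datetime(2024, 1, 1, 23, 0), 5.0),
    (datetime(2024, 1, 2, 23, 0), 7.0),
]

baseline = build_baseline(samples)
for hour, (avg, std) in sorted(baseline.items()):
    print(f"{hour:02d}:00  mean={avg:.1f}%  std={std:.2f}")
```

With this data, “normal” at 9 a.m. is around 50%, while “normal” at 11 p.m. is around 6%, so the same reading can be routine at one hour and unusual at another.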
Setting alerts. In addition to setting static thresholds, it’s important to establish alerts based on deviation from baseline performance. Beyond a daily alert about high bandwidth usage, you need to know when an unexpected spike occurs during working hours due to a unique user-initiated action. You should be able to specify how many standard deviations you consider acceptable for any metric. This requires an understanding of baseline historical performance for all metrics monitored. This method provides a more reliable predictor of service-impacting events and helps reduce false-positive alerts.
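One way to combine the two kinds of alert is sketched below. The function check_metric, its parameters and the example numbers are all hypothetical; the sketch simply assumes you already know the baseline mean and standard deviation for the metric at the current time.

```python
def check_metric(value, baseline_mean, baseline_std,
                 max_deviations=3.0, static_limit=None):
    """Return the list of alert types triggered by a single reading.

    "static"    -> the reading exceeds a fixed threshold.
    "deviation" -> the reading is more than max_deviations standard
                   deviations away from the baseline mean.
    """
    alerts = []
    if static_limit is not None and value > static_limit:
        alerts.append("static")
    # Skip the deviation test when the baseline has no spread,
    # to avoid dividing by zero.
    if baseline_std > 0:
        deviations = abs(value - baseline_mean) / baseline_std
        if deviations > max_deviations:
            alerts.append("deviation")
    return alerts


# Working-hours spike: well under the 90% static limit, but 6 standard
# deviations above a baseline of 40% (std 5%).
print(check_metric(70.0, 40.0, 5.0, max_deviations=3.0, static_limit=90.0))
# ['deviation']

# Busy but normal: over the static limit, yet within the usual range
# for a link whose baseline is 85% (std 5%).
print(check_metric(92.0, 85.0, 5.0, max_deviations=3.0, static_limit=90.0))
# ['static']
```

The first reading would never trip a static threshold, yet it is the unexpected spike worth investigating. The second trips the static threshold even though it is ordinary for that link, which is how static thresholds alone generate false positives.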
Creating reports. Canned reports reveal the most utilized interfaces, highest packet loss and other key metrics. Yet they don’t allow for the level of manipulation often required to troubleshoot performance issues. You need the ability to graph any time-series metric on a single screen or report to help correlate the cause of service degradation. You also need to understand how increases in the number of objects you monitor affect the speed of your reporting platform. Performance monitoring solutions that rely on a centralized database architecture suffer significant degradation in reporting speed as your monitored domain expands. It’s always best to maintain information in a distributed fashion and have the system query the data when needed. Reports that fail to provide near real-time information are unacceptable.
Analyzing data. The goal is to find the actionable insight you need to proactively detect and avoid performance events, understand correlations that can help fine-tune infrastructure and make more informed forecasting decisions about the impact infrastructure has on the business. The key to properly analyzing performance data is to have all the data in one place. That means accessing metric, flow and log data from a single platform to avoid “swivel chair” analysis.
Sharing results. Once armed with the strategic ability to collect, baseline, alert, report and analyze your performance data, it’s time to share insights with team members who can truly benefit from monitoring results. This requires knowing your audience. For instance, a CTO is most interested in service-level or even market-level views of performance. Sharing information also means sharing data with other platforms, such as fault or configuration management solutions. It should be just as easy to export data as it is to ingest it.
Remember to focus on refining the process to eliminate wasted time and energy. Breaking down your performance monitoring process into components provides clarity around your strategy. By understanding the core requirements of a monitoring strategy, you also arm yourself with the knowledge to make an informed buying decision when evaluating performance monitoring vendors.
For more information, download this white paper on 6 Steps to an Effective Performance Monitoring Strategy.