Arthur Cole spoke with John Newsom, vice president and general manager of application management, Quest Software.
Cole: Now that virtual infrastructure has taken hold in most data centers, many IT executives are finding that application management is more difficult than in the days of physical servers. What are some of the ways they can streamline their application environments in the virtual world?
Newsom: Virtualization presents two key challenges as it applies to application management -- understanding the impact of resource sharing and ensuring that adequate resources are provided to support existing and new application workloads. The key to streamlining application environments is to first determine how the four core resources -- CPU, memory, disk and network -- support applications in the context of meeting performance, availability and service level objectives. Next, organizations need to understand the relationships and interactions between all the components in the virtual infrastructure and how the applications leverage them. Finally, with these dependencies understood, IT teams need to monitor the performance of each component supporting the application while correlating that data so that it's consumable by service owners and others in IT management.
Cole: Many times, though, problems remain hidden until service levels or availability are actually compromised. Are there typically any warning signs that all is not as it should be?
Newsom: The best chance for reliable early warning signs comes from end-user experience monitoring -- a combination of both real-user and synthetic monitoring. Problems with transactions initiated by an individual user or a particular region of users will surface long before an SLA is breached. Thresholds on end-user activity can be more sensitive and catch things earlier than an aggregate SLA would.
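To illustrate how a per-region threshold can fire before an aggregate SLA does, here is a minimal Python sketch. The SLA value, the tighter regional warning level, the region names and the sample response times are all hypothetical and chosen only for illustration; they do not come from any particular monitoring product.

```python
from statistics import mean

# Hypothetical values, chosen only for illustration.
AGGREGATE_SLA_MS = 1000.0    # assumed SLA on the mean response time across all users
REGION_WARNING_MS = 800.0    # assumed tighter threshold applied to each region


def aggregate_sla_breached(samples_by_region):
    """Return True if the mean response time across all regions exceeds the SLA."""
    all_times = [t for times in samples_by_region.values() for t in times]
    return mean(all_times) > AGGREGATE_SLA_MS


def regions_in_warning(samples_by_region):
    """Return the regions whose own mean response time exceeds the warning level."""
    return sorted(
        region
        for region, times in samples_by_region.items()
        if mean(times) > REGION_WARNING_MS
    )


# Hypothetical response times in milliseconds, from real or synthetic transactions.
samples = {
    "amer": [280.0, 300.0, 290.0],
    "emea": [300.0, 320.0, 310.0],
    "apac": [900.0, 950.0, 1000.0],
}
```

With this sample data the aggregate check stays quiet, because the two healthy regions pull the overall mean well under the SLA, while the per-region check flags the one slow region.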
In parallel, you need visibility into the application infrastructure, including application, Web, message queue and database layers, as well as the hypervisor. Visibility into these various perspectives at the same time and based on the same data is critical.
Lastly, you need the ability to create guaranteed connections between all of the infrastructure's components and layers so you can determine if the early warning signs are tied to a critical service or if the problematic component is an element of a non-critical service. If you are chasing non-critical issues simply because they surface without any prioritization, you can lose sight of, and the ability to triage, incidents tied to critical services. On the flip side, early warning signs in the physical world may be false alarms in the virtual world. Thresholds on virtualized resources are expected to increase as resource utilization is maximized. If the same threshold used for a physical resource consumption limit is used to trigger a triage situation, you will end up chasing what are likely false warning signs and potentially limit your ability to respond to the real ones.
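The prioritization described above can be sketched as a walk over a dependency map: starting from the component that raised a warning, follow its dependents upward and check whether a critical service is reached. The component names, the map itself and the set of critical services below are hypothetical; in practice a monitoring tool would have to discover or be given these relationships.

```python
from collections import deque

# Hypothetical dependency map: each key is depended on by the entries in its list.
DEPENDENTS = {
    "vm-db-01": ["orders-db"],
    "orders-db": ["checkout-service"],
    "vm-batch-07": ["report-generator"],
    "report-generator": ["weekly-reports"],
}

# Hypothetical set of services the business treats as critical.
CRITICAL_SERVICES = {"checkout-service"}


def affected_critical_services(component):
    """Breadth-first walk from a component up to everything that depends on it."""
    seen = {component}
    queue = deque([component])
    hits = set()
    while queue:
        node = queue.popleft()
        if node in CRITICAL_SERVICES:
            hits.add(node)
        for dependent in DEPENDENTS.get(node, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return hits


def triage(alerting_components):
    """Order alerts so those tied to a critical service come first."""
    # sorted() is stable, so alerts keep their arrival order within each group.
    return sorted(
        alerting_components,
        key=lambda component: not affected_critical_services(component),
    )
```

In this sketch a warning on `vm-batch-07` is still reported, but it is placed behind a warning on `vm-db-01`, which traces up to the critical checkout service.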
Cole: Does application management in virtual environments require an entirely new approach, then? Or are there any tools and techniques that can be brought over from the physical environment?
Newsom: Fundamentally, it's a similar approach because you are still answering the same questions: Is an SLA being violated? Are end users being impacted? How many? For how long? What is the root cause of the issue? Who on the team needs to fix it? However, the tools to address these questions now must change for two reasons.
First, the past paradigm was based on resource utilization. It was once bad to have resource utilization in the physical world exceed 80 or 85 percent -- a threshold would be passed and an alert would be fired off to an administrator. However, high resource utilization is the very goal of virtualization, and resources are expected to run at very high percentages. In addition to collecting and reporting on resource utilization metrics, tools must look to transaction response times and end-user service levels.
This leads to the second fundamental change needed in tooling. Since end users, transactions and services are inherently crossing the boundaries of technologies and IT teams, the siloed nature of tooling and monitoring must end. Teams need a management system that is based on a model of the service and its transactions, not of the physical layout of teams or technology domains. This system must empower the individual team while maintaining a view into the overall transaction and service as an end user consumes it.
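The first change, moving from utilization thresholds to response times and service levels, can be sketched as two alert rules evaluated on the same sample. The 85 percent limit echoes the figure mentioned above; the response-time target and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    cpu_utilization: float    # fraction of capacity in use, 0.0 to 1.0
    response_time_ms: float   # measured transaction response time


PHYSICAL_CPU_LIMIT = 0.85         # the older, physical-world style threshold
RESPONSE_TIME_TARGET_MS = 500.0   # hypothetical service-level target


def utilization_alert(sample):
    """Older rule: alert whenever utilization passes a fixed limit."""
    return sample.cpu_utilization > PHYSICAL_CPU_LIMIT


def service_level_alert(sample):
    """Service-oriented rule: alert when the transaction is actually slow."""
    return sample.response_time_ms > RESPONSE_TIME_TARGET_MS


# A heavily used virtualized host that is still serving transactions quickly.
busy_but_healthy = Sample(cpu_utilization=0.93, response_time_ms=220.0)

# A lightly used host whose transactions are nevertheless slow.
idle_but_slow = Sample(cpu_utilization=0.40, response_time_ms=1800.0)
```

Under these assumptions the utilization rule raises a false alarm on the busy host and misses the slow one, while the service-level rule does the opposite. Utilization is still collected, but it is no longer the only trigger.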