…or at least it’s not delivering on its promise of improving performance
The value of Application Performance Management (APM) is perceived as “less than fair.” Over 80% of large and mid-sized organizations worldwide have made multi-million dollar investments in APM solutions with the expectation that these capabilities would reduce production outages, quickly pinpoint the precise root causes of issues during incidents, and speed time to resolution. In a show of hands at Interop 2011 during the ‘Service Delivery Management’ panel session moderated by Jim Metzler, an industry-recognized expert in network technology and business applications, NONE of the attendees agreed that APM was working well at their organizations and only 2 agreed it was performing at a “fair” level, putting the remaining 70+ people in the “less than fair” category.
It’s worth repeating – millions of dollars have been invested in APM. Yet, according to NIST, four out of every five dollars of an application’s total cost of ownership are spent finding and fixing problems post-deployment.
If you are not familiar with APM or how it is specifically defined, Gartner and others have outlined APM capabilities as covering these five functional dimensions:
- End-user experience monitoring
- Application runtime architecture discovery and modeling
- User-defined transaction profiling (also called Business Transaction Management)
- Application component deep-dive monitoring
- Application data analytics
As a panelist, I discussed this disconnect between expected and delivered value with my peers, and we speculated as to the reasons behind it. Here are a few of our observations.
- Most APM solutions focus on object-based system monitoring, with a constant stream of alerts and a problem-based environmental picture. The result is a Network Operations Center (NOC) overwhelmed by thousands of alerts per hour, often with only a handful of 7×24 operators to handle them.
- The complexity of implementing APM in an enterprise prohibits widespread adoption of a solution across the infrastructure, application, and business unit ecosystem. Yet incidents can occur anywhere in the system, and their impact is often felt through several layers of dependencies.
- The focus of IT management and spending has been on implementing new hardware technologies that enable larger throughput and capacity (e.g. 40Gig devices) rather than on managing application performance. This seems to be occurring because the roles and responsibilities of IT decision-makers are aligned more with Infrastructure than with Applications.
- The rise of Cloud and Mobile infrastructures is introducing new and unplanned-for performance risks. New skills and approaches to performance are required to manage the complexity of these new infrastructures.
- Best practices of Application Performance Engineering (APE) are only now being introduced into organizations, and a proactive approach to building performance into the entire development lifecycle, before application deployment and before the point of performance monitoring, is not yet pervasive – the need is recognized, but budget and resources are not yet aligned with the need.
We think this show-of-hands needs to be viewed as a wake-up call:
Application Performance is something we all need to recognize as a major risk, and, in parallel, we need to accelerate awareness of the importance and value of proactively mitigating application performance issues prior to production.
Business success depends on this; we don’t have the luxury of time and must act now. As one of the APM industry leaders in the session stated, “Issues with the performance of business-critical applications can cause deterioration of an organization’s business performance. Slow or not readily available applications that support key business processes can cause revenue loss, and decline in customer satisfaction, employee productivity or brand reputation.”
Please leave a reply and submit your comments below. We welcome both supporting and opposing views as we seek to move the status quo and maximize the value of APM investments by implementing complementary APE capabilities.

May 25th, 2011 at 20:50
I agree with your observation. I think I have come across all those points while working with APM tools in large companies. Companies buy the tool and then there is hardly anyone who knows how to use it, or there are just not many people with experience in it. People still believe open source Fiddler traces rather than HP Diagnostics traces or LR traces, folks still believe perfmon rather than SiteScope traces, and folks still want to use IIS log parsers to know the number of times they have received a 500 rather than the HP Diagnostics count, in spite of the call being instrumented. I have seen countless examples of such instances. It’s just that everyone wants to see the issue occurring in their favorite tool. Lack of education, I believe, could also be one of the reasons for this.
July 29th, 2011 at 00:40
I don’t think APM is broken, just the solutions and the process they promote, which unfortunately relies heavily on an element that does not have “web scale” – humans.
Software needs to be self-regulated and cost aware, and the foundation for that is APM, but not like what has been offered up to now, which is largely driven by “user” consumption needs.
Automated Performance Management starts with Software’s Self Observation
http://opencore.jinspired.com/?p=2709
The Ultimate Feedback Loop
http://opencore.jinspired.com/?p=4052
September 13th, 2011 at 20:34
I think the biggest problem is that when you quote “multi-million dollar investments in APM solutions”, what you really mean is APM Tools.
Application Performance Management is a process. You define how you will manage your application’s performance throughout the lifecycle. This includes a three-way feedback loop between production monitoring, capacity planning and performance testing.
People need to stop thinking about how “APM solutions” can solve the problem, and think about how people and process can solve the problem, and support them with the right tools, not just ones branded with APM.
September 14th, 2011 at 12:40
Joel,
I agree with your comment. When we say “APM is broken”, we are using Gartner’s definition of APM which is “application performance monitoring” – the tools or solutions used in the production environment. You are correct, though – the “application performance management” process is also broken. The feedback loop you described is not often fully or correctly implemented: the disconnect between pre-production and production environments (and people) must be fixed. For example, pre-production test labs must be able to incorporate real-world production network conditions in order to ensure reliable testing (load, performance, capacity, etc.). Similarly, operations teams that are monitoring applications in production must know the SLAs that have been established and validated in pre-production so they know what a performance error is.
Thanks for the comment and we look forward to continuing the dialog.
November 3rd, 2011 at 08:49
As a performance engineer and a member of a team bringing APM to our company, I do agree that there is a disconnect between the sales pitch and the delivered product. But that is the case with most products; look at the pictures of food on a menu.
I tend to disagree that APM is broken; the processes, the teams and unrealistic expectations, even the sales rep, might be. We are going through a major APM implementation and are seeing plenty of benefits already. That being said, it is being implemented by a team of highly technical Architects and Sr. Analysts.
APM and APE should go hand in hand, and they are not mutually exclusive. If your success is driven by your software, it is foolish to ignore either. Our plan is to have APM on our QA environments as well as our production environments.