APM is Broken

Mon, May 23, 2011

Events, Featured Post, Staff Posts

…or at least it’s not delivering on its promise of improving performance

The value of Application Performance Management (APM) is perceived as “less than fair.” Over 80% of large and mid-sized organizations worldwide have made multi-million dollar investments in APM solutions, expecting these capabilities to reduce production outages, quickly pinpoint the precise root causes of issues during incidents, and speed time to resolution. In a show of hands at Interop 2011 during the ‘Service Delivery Management’ panel session moderated by Jim Metzler, an industry-recognized expert in network technology and business applications, none of the attendees agreed that APM was working well at their organizations, and only two rated its performance as fair, putting the remaining 70+ people in the “less than fair” category.

It’s worth repeating: millions of dollars have been invested in APM. Yet, according to NIST, four out of every five dollars of an application's total cost of ownership are spent on finding and fixing problems post-deployment.

If you are not familiar with APM or how it is specifically defined, Gartner and others have outlined APM capabilities as covering these five functional dimensions:

  1. End-user experience monitoring
  2. Application runtime architecture discovery and modeling
  3. User-defined transaction profiling (also called Business Transaction Management)
  4. Application component deep-dive monitoring
  5. Application data analytics

As a panelist, I discussed this disconnect between expected and delivered value with my peers, and we speculated as to the reasons behind it. Here are a few of our observations.

  1. Most APM solutions focus on object-based system monitoring, with a constant stream of alerts and a problem-based environmental picture. This results in a Network Operations Center (NOC) that is overwhelmed by thousands of alerts per hour and often has only a handful of 24×7 operators.
  2. The complexity of implementing APM in an enterprise hinders widespread adoption of a solution across the infrastructure, application, and business unit ecosystem. Yet incidents can occur anywhere in the system, and their impact often passes through several layers of dependencies.
  3. The focus of IT management and spending has been on implementing new hardware technologies that enable larger throughput and capacity (e.g. 40Gig devices) rather than on managing application performance. This seems to be occurring because the roles and responsibilities of IT decision-makers are aligned more with Infrastructure than with Applications.
  4. The rise of Cloud and Mobile infrastructures is introducing new and unplanned-for performance risks. New skills and approaches to performance are required to manage the complexity of these new infrastructures.
  5. Best practices of Application Performance Engineering (APE) are only now being introduced into organizations. A proactive approach that builds performance into the entire development lifecycle, before application deployment and before the point of performance monitoring, is not yet pervasive: the need is recognized, but budget and resources are not yet aligned with it.

We think this show-of-hands needs to be viewed as a wake-up call:
Application Performance is something we all need to recognize as a major risk, and, in parallel, we need to accelerate awareness of the importance and value of proactively mitigating application performance issues prior to production.

Business success depends on this; we don’t have the luxury of time and must act now. As one of the APM industry leaders in the session stated, “Issues with the performance of business-critical applications can cause deterioration of an organization’s business performance. Slow or not readily available applications that support key business processes can cause revenue loss, and decline in customer satisfaction, employee productivity or brand reputation.”

Please leave a reply and submit your comments below. We welcome both supporting and opposing views as we seek to move the status quo and to maximize the value of APM investments by implementing complementary APE capabilities.

Written by: todd.decapua - who has written 3 posts on Application Performance Engineering Blog – Shunra Software.

Todd DeCapua is one of the technology industry's most respected thought leaders on Application Performance Engineering and a renowned speaker, author and visionary. Mr. DeCapua's IT & QA background encompasses nearly all industries and over 70 organizations with extensive consulting experience -- before joining Shunra in 2010, he held senior leadership roles both within IT Development and IT Infrastructure. His expertise includes application development, global project management, partnership strategy, collaborative methods like Agile Scrum, infrastructure architecture, business continuity and disaster recovery. In 2009 he was invited to sit on the HP Customer Advisory Board for LoadRunner & Performance Center; in 2010 named HP Software Universe "Best & Brightest" and Vivit Worldwide Leader of the Year. He is also a certified ScrumMaster, Scrum Practitioner, and Six Sigma Green Belt; and is also accredited with an MBA, Concentration in Finance.


5 Comments For This Post

  1. Kiran Says:

    I agree with your observation. I think I have come across all of those points while working with APM tools in large companies. Companies buy the tool, and then there is hardly anyone who knows how to use it; there are just not many people with experience in it. People still trust open source Fiddler traces rather than HP Diagnostics traces or LR traces, perfmon rather than SiteScope traces, and IIS log parsers to count how many times they have received a 500 rather than the HP Diagnostics count, even though the call is instrumented. I have seen countless examples of this. It's just that everyone wants to see the issue occurring in their favorite tool. Lack of education, I believe, could also be one of the reasons for this.

  2. William Louth Says:

    I don’t think APM is broken, just the solutions and the process they promote, which unfortunately relies heavily on an element that does not have “web scale”: humans.

    Software needs to be self-regulated and cost aware, and the foundation for that is APM, but not like what has been offered up to now, which is largely driven by “user” consumption needs.

    Automated Performance Management starts with Software’s Self Observation
    http://opencore.jinspired.com/?p=2709

    The Ultimate Feedback Loop
    http://opencore.jinspired.com/?p=4052

  3. Joel Deutscher Says:

    I think the biggest problem is that when you quote “multi-million dollar investments in APM solutions”, what you really mean is APM Tools.

    Application Performance Management is a process. You define how you will manage your application’s performance throughout the lifecycle. This includes a three-way feedback loop from production monitoring, capacity planning and performance testing.

    People need to stop thinking about how “APM solutions” can solve the problem, and think about how people and process can solve the problem, and support them with the right tools, not just ones branded with APM.

  4. marty.brandwin Says:

    Joel,
    I agree with your comment. When we say “APM is broken”, we are using Gartner’s definition of APM which is “application performance monitoring” – the tools or solutions used in the production environment. You are correct, though – the “application performance management” process is also broken. The feedback loop you described is not often fully or correctly implemented: the disconnect between pre-production and production environments (and people) must be fixed. For example, pre-production test labs must be able to incorporate real-world production network conditions in order to ensure reliable testing (load, performance, capacity, etc.). Similarly, operations teams that are monitoring applications in production must know the SLAs that have been established and validated in pre-production so they know what a performance error is.

    Thanks for the comment and we look forward to continuing the dialog.

  5. James Says:

    As a performance engineer and a member of a team bringing APM to our company, I do agree that there is a disconnect between the sales pitch and the delivered product. But that is the case with most products; look at the pictures of food on a menu.

    I tend to disagree that APM is broken; the processes, the teams and unrealistic expectations, even the sales rep. might be. We are going through a major APM implementation and are seeing plenty of benefits already. That being said it is being implemented by a team of highly technical Architects and Sr. Analysts.

    APM and APE should go hand in hand; they are not mutually exclusive. If your success is driven by your software, it is foolish to ignore either. Our plan is to have APM on our QA environments as well as our production environments.
