Note: It is very common for mid-level managers, such as this Regional CIO, to request anonymity. Large companies often have rules that allow only designated staff to act as public spokesmen. So, even if a mid-level manager has an interesting thing or two to say, they can't say it using either their name or their company's name. So the following customer profile does not disclose either the company or the mid-level manager's name.
From time to time, I have an opportunity to communicate with someone who is actually using a vendor's products. I find those conversations helpful and informative. I expect you will too. This time, I heard from a regional CIO of a global provider of wireless telecommunications services. Thanks for taking the time to give me some insight into what you were doing.
Please introduce yourself and your organization.
I'm the Northeast Regional CIO of a leading provider of wireless telecommunication services and a major cellphone carrier in the U.S. with more than 94 million retail customers. Our 2,000 company-owned Communications Stores, kiosks and stores-within-a-store across the country provide the vital link in bringing our products and services to new and existing mobile device customers. Reliability of these point-of-sale systems and visibility into their performance are critical success factors for us to maintain our leadership position and growing market share.
What were you doing that needed this type of technology?
My department's goal is delivering 99.95 percent system uptime. But POS outages through the first four months of 2012 had brought the IT organization very close to its annual budget for downtime minutes, with the all-important holiday season still ahead.
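Note: To put an uptime target like this in perspective, it can be converted into a yearly allowance of downtime minutes. The short sketch below does that arithmetic for the 99.95 and 99.99 percent figures mentioned in this profile. It assumes availability is measured around the clock over a 365-day year; the CIO did not say how his organization actually counts downtime, and the function name is made up for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; assumes a non-leap year, 24x7 measurement


def downtime_budget_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (99.95, 99.99):
    budget = downtime_budget_minutes(target)
    print(f"{target} percent uptime allows about {budget:.1f} minutes of downtime per year")
```

Under those assumptions, 99.95 percent leaves a little over four hours of downtime per year, and 99.99 percent leaves less than one hour.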
More and more, heading into the holiday retail season, it feels like we're Best Buy and not just a telco. Revenue from Black Friday through the end of the year accounts for 50 percent of the annual sales generated through these channels.
To achieve our aggressive uptime goal for our POS and account provisioning application, my organization determined that it would have to become more proactive in managing application performance and create a dedicated proactive ops team equipped with the best tools. We follow all of the functional criteria in what Gartner calls the five dimensions of application performance monitoring. The fifth dimension, IT analytics, focuses on the correlation of all the data gathered to provide visibility into patterns of application behavior, isolate issues for root-cause diagnosis, and deliver proactive notification of performance anomalies so they can be resolved before they spiral out of control and impact the business.
What products did you consider before making a selection?
We had deployed tools from leading APM vendors such as CA and Compuware. Our own business data metrics, such as devices sold, accounts activated, services added, and more, also provide the crucial information on how well we're performing.
While we were collecting all of the data necessary for effective APM, the monitoring team had no way to make sense of it all. There was no help for understanding what constitutes normal behavior of components, infrastructure, or applications; which of the thousands of IT metrics are most meaningful in terms of business performance; or whether IT performance even has any impact on business metrics. The monitoring tools provided none of the cross-platform visibility, automated analysis, or alarm accuracy required for proactive performance management. There actually was one occasion when an outage in the POS system covering an entire state went undetected for four hours until a business executive contacted IT to ask whether the total absence of sales activity was in fact real.
So the remaining technical challenge was to fulfill Gartner's fifth dimension of APM: to automatically analyze and correlate IT, customer experience, and business metrics in real time. This would help either to exonerate IT as a cause of reported business deviations or, when IT issues were impacting the business, allow IT to proactively alert business managers to the problem, rather than being unaware of it until called by irate users.
I expected that implementing all dimensions of APM would lead to the IT organization achieving 99.99 percent availability of the point-of-sale application, reducing mean time to repair for problems that do occur, optimizing transaction response times, and improving overall team efficiency. Achieving this IT performance would in turn support the business objectives of improving the internal and external customer experience and avoiding the cost--primarily lost revenue--of application outages.
A review of the APM marketplace suggested that Netuitive offered the only solution capable of correlating IT and business metrics as required for the missing fifth dimension of APM. So we designed a proof-of-concept pilot project to demonstrate whether Netuitive could in fact deliver value above and beyond the monitoring tools we already had in place.
What tangible benefits have you received through the use of this product?
Almost immediately, the Netuitive solution began demonstrating its value. Netuitive's composite Application Health Score and Service Health Dashboards gave our IT organization its first ever at-a-glance insight into the real-time performance of the point-of-sale retail application from a location, geographic, channel, or company-wide perspective.
Unlike conventional monitoring solutions that we use--that generate more data, graphs, and meters than anyone could ever use or understand--the Netuitive Application Performance Scorecard combines complexly related information into the simplicity and clarity of a single number, prominently displayed in the monitoring console. The higher the number, the healthier the application, and the better the business is performing. Color coding of individual elements, business KPIs, or locations in shades of green, amber, and red enables IT staff to quickly isolate a developing issue calling for proactive resolution to avoid a more serious problem.
Netuitive also enabled our IT staff to drill down from the main console for really useful cross-platform visibility into IT, customer experience, and business activity metrics, and to understand their impact on one another. As far as we know, only Netuitive provides this statistical correlation by continuously self-learning and reporting the relationship of various IT infrastructure, application, and business KPIs.
We were pleased to learn that Netuitive understands our environment's full range of normal operating characteristics and can trend expected systems, application, and business behavior. It can accurately identify and alert our staff to the leading indicators of performance issues hours before they become problems.
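Note: The general idea of learning what "normal" looks like and flagging departures from it can be illustrated with a very simple trailing-window baseline. The sketch below is only an illustration of that idea and is not Netuitive's algorithm, which the CIO did not describe. The function name, window size, threshold, and sample data are all hypothetical.

```python
from statistics import mean, stdev


def abnormal_points(values, window=12, threshold=3.0):
    """Return indices whose value deviates from the mean of the preceding
    `window` samples by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(values[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged


# Hypothetical transactions-per-minute samples with one spike at index 14.
rates = [100, 102, 99, 101, 100, 98, 103, 100, 101, 99, 102, 100, 101, 100, 180, 101]
print(abnormal_points(rates))
```

A production tool would also have to account for time-of-day and seasonal patterns, which a fixed trailing window like this one ignores.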
Can you discuss a specific incident in which this product (Netuitive) proved itself to you?
Sure. In fact, we had an interesting incident on November 30 last year, one week after Black Friday. Netuitive was monitoring both the performance of the JVMs for our point-of-sale system and the performance of the mainframe, which processes transactions on the back end of the POS system. Netuitive was warning us that the mainframe was experiencing historically "abnormal" transaction rates. Unfortunately, we were not paying attention to the mainframe alerts at the time, and this eventually led to a slowdown in the performance of the application clusters responsible for the user interface or "front end" of the POS system. Our application support team was simply not used to having the visibility into the mainframe that Netuitive was providing.
Netuitive helped us learn two things. First, during this slowdown of the system, we were able to correlate it with a drop in gross sales in stores for the region affected. We saw this in real time for the first time and that just blew us away. Second, we realized that had we started to look into things when there was abnormal behavior in the mainframe that Netuitive warned us about, this whole incident could have been prevented instead of having a significant business impact, even in the short time the system was slow. This really blew me away.
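Note: Relating an IT metric to a business metric, as described in this incident, can be illustrated with an ordinary Pearson correlation coefficient. The sketch below uses made-up response-time and sales figures, not data from the incident, and it is not a description of how Netuitive computes its correlations.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical samples: POS response time (ms) and sales per interval for one region.
response_ms = [200, 210, 205, 900, 1500, 1400, 220]
sales = [50, 52, 49, 20, 5, 8, 51]

r = pearson(response_ms, sales)
print(f"correlation between response time and sales: {r:.2f}")
```

A value close to -1 means sales fall as response times rise. Correlation alone does not prove that the slowdown caused the drop, but it tells the operations team where to look first.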
Results of the three-week pilot project validated to our IT organization that automated analytics was a necessary part of its goal to achieve 99.99 percent availability of the point-of-sale application--and that Netuitive predictive analytics is the only solution that meets that requirement. Analyzing and correlating nearly 1 million KPIs from 6,000 managed elements, Netuitive proved conclusively that it can deliver the fifth dimension of APM.
What advice would you offer others facing similar circumstances?
Ask your suppliers to demonstrate that they are able to bridge the analysis gap between the IT organization's key performance indicators and those of the business.
Make sure that the IT organization knows what the internal business customers need. The ability to notify them in advance of IT issues demonstrates a proactive and helpful stance. They would be impressed that issues that could impact business performance are brought to light before they have a chance to cause problems.
We believe that Netuitive's technology has helped us address all five dimensions in Gartner's application performance management model. We believe that the use of Netuitive's technology assures application availability, promotes a superior customer experience, and generates a return on investment in reducing or avoiding the costs of outages.