After profiling a supplier and its products, I find it really interesting to speak with one of its customers. I've spoken with the good folks from Splunk many times (see "Splunk adding end-to-end visibility to IT's tool kit" for a recent example). Recently I communicated with Peter Ehlke, Principal Systems Engineer at Pegasus Solutions. Thanks, Peter, for taking the time to answer my questions.
Please introduce yourself and your organization.
Pegasus Solutions is a company that you likely often encounter without ever knowing it. As the world's leading IT services provider to the travel and hospitality industry, Pegasus delivers reservations, commission and payment processing, and hotel marketing services worldwide to nearly 100,000 hotels, 1,000 travel websites and a majority of the world's travel agencies.
Pegasus empowers these organizations to accomplish the once-impossible feat of correlating room rates, inventories, payments and commissions for reservations systems, agents and hospitality chains. As a critical component in the travel and hospitality revenue engine, Pegasus demands peak performance and ultimate reliability to maintain customer satisfaction and service levels.
What are you doing that needed this technology?
Pegasus processes four billion transactions per month for travel industry icons such as Marriott, The Fairmont, LaQuinta and Orbitz, making it the largest single processor of hotel transactions in the world. These marquee customers demand peak performance and ultimate reliability to maintain customers’ trust through sub-second response times and uptime of nearly 100 percent.
Because of the sheer volume, before adopting Splunk, Pegasus could keep only 24 hours of transactions online and three days of backup transactions on-premises. Yet verifying historical guest activity, such as a cancellation or confirmation, is a common occurrence.
Any request to track a transaction older than three days required Pegasus to request delivery of the tape from off-site storage. Once the tape was on-site, Pegasus had to review logs across 15 different systems to locate and reconstruct the request.
Peter Ehlke pointed out that "It used to take hours, even days to track transactions for some customers. Using Splunk, our support teams can typically respond to issues immediately, with real-time data and insight, while the customer is still on the phone!"
Other "discomforts" Pegasus experienced included accessing log data directly on production systems, force-fitting homegrown logging formats into database-driven tools, and running customer reports in batches.
What tangible benefits has your organization gotten through the use of this product?
Operations Management
Pegasus developed several “canary-in-the-coalmine” dashboards that include all the critical metrics and telemetry to assess system performance at a glance. At the highest level, the overall system state is gauged through a traffic metric – the number of transactions per second. The operations team also monitors user response times to detect server performance issues or network issues.
At the next level, the dashboard shows an average response time aggregated across all transaction types as well as the standard deviation, so they know if the whole system is sluggish or if it’s only a few outlier transactions.
They get a further breakdown through more scripted inputs that gather and display the execution time for each transaction type. Scripting that reaches deeper into the system for the number of transactions in the message queue provides early warning. If this number starts to rise, Pegasus knows a problem is brewing.
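To make the idea behind these dashboard metrics concrete, here is a minimal Python sketch. It is not Pegasus's or Splunk's implementation; the function names, the thresholds (a one-second mean, three standard deviations) and the three-sample window are all assumptions chosen purely for illustration.

```python
from statistics import mean, pstdev


def summarize_response_times(samples_ms):
    """Return (mean, standard deviation) of response times in milliseconds."""
    return mean(samples_ms), pstdev(samples_ms)


def classify(samples_ms, slow_mean_ms=1000.0, outlier_sigmas=3.0):
    """Tell a system-wide slowdown apart from a few outlier transactions.

    slow_mean_ms and outlier_sigmas are hypothetical thresholds.
    """
    avg, sd = summarize_response_times(samples_ms)
    if avg > slow_mean_ms:
        # The average itself is high: the whole system is sluggish.
        return "system-wide slowdown"
    # The average is fine; look for samples far above it.
    cutoff = avg + outlier_sigmas * sd
    if sd > 0 and any(s > cutoff for s in samples_ms):
        return "outliers only"
    return "healthy"


def queue_is_rising(depths, window=3):
    """Early warning: True if the last `window` queue depths strictly increase."""
    recent = depths[-window:]
    if len(recent) < window:
        return False
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))
```

Fed with twenty 200 ms samples and a single 5,000 ms sample, `classify` reports outliers only, because the mean stays low while one sample sits far above it. A steadily growing list of queue depths makes `queue_is_rising` return `True` before response times have visibly degraded.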
Customer Service
With Splunk, the support team can typically respond to customers while they're still on the phone. A year’s worth of transactions available online reduces inquiry-tracking time from nearly 40 hours a week to minutes, freeing up one full-time employee and dramatically improving customer service.
Pegasus also plans to use Splunk to offer proactive and immediate customer service. Rather than relying on scheduled reports, Pegasus plans to offer portlets where customers can view their data in real time. So, if customers want to measure the impact of a promotion, they can see results immediately and change offers dynamically based on a real-time analysis of results.
Application Troubleshooting
Splunk was adopted as a standard by developers, QA and operations during the rollout of Pegasus's state-of-the-art reservation system.
- Developers securely access error logs via Splunk, so they no longer log into the live environment.
- Compliance is assured by mapping Splunk accounts to Active Directory, so administrators only have to manage permissions in one place. User data is masked at index time to mitigate security risk and keep it private.
- Splunk measures how various configurations affect performance.
- Splunk helps developers fine-tune performance for production, helping them analyze and detect performance anomalies right down to the particular code segment causing them.
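
The index-time masking mentioned in the list above can be pictured with a small Python sketch. This is only an illustration of the general idea of rewriting a log line before it is stored; it is not how Splunk is configured, and the choice of 16-digit card numbers as the sensitive field, the pattern and the function name are all assumptions.

```python
import re

# Hypothetical sensitive field: a standalone 16-digit number.
# The last four digits are captured so they can be kept for support lookups.
CARD_PATTERN = re.compile(r"\b\d{12}(\d{4})\b")


def mask_card_numbers(line):
    """Replace the first 12 digits of any 16-digit number with X characters."""
    return CARD_PATTERN.sub(lambda m: "X" * 12 + m.group(1), line)
```

Because the masking happens before the line is indexed, anyone searching the logs later sees only the masked form, which is the property the bullet above describes.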