Coles Online has been live since 1998, and has been running on its current IT platform since 2008. Yet, despite having installed "heaps of monitoring tools" in the past, staff members were finding it cumbersome to locate the information they needed.
"Generating new reports and additional metrics usually required some specialised skills," Coles Online web ecommerce manager Merric Reese told this week's SplunkLive conference in Melbourne. "The tools were very good at providing specific information about the component they were monitoring, but I had to sift through multiple tools and interfaces to find what I was looking for.
"Trying to get an overall view of system performance in real time was extremely challenging. The time required to create new dashboards, or update them, was usually measured in days, and sometimes even in weeks."
Early experiments with the Splunk analytics platform quickly breathed new energy into performance monitoring, with tentative first efforts able to collate data from multiple systems and present it through a series of interactive dashboards.
As a monitoring and diagnostic tool, the new environment provided a one-stop shop for Reese and his team — but it was the cultural change it engendered that got him most excited.
"Being able to do a couple of searches, and get the answer myself in about five minutes, was truly awesome," he said. "The team got really busy creating new dashboards, I got an understanding of the syntax within a couple of hours, and after that there was no stopping us."
The dashboards comb through around 3GB per day of new performance data.
Better visibility of site metrics even proved useful in security monitoring and picking up on behavioural anomalies, with real-time visibility of site errors making it easy to spot "a huge number of errors" hitting the site. The offending queries were requesting pages that didn't exist, and after watching the behaviour for some time, Reese twigged to what was going on.
"After doing some particular searches and mapping customer transactions through the process they were doing, it was pretty clear and quite easy to recognise that users were trying to scrape data from the Coles Online website," he said. "We needed a solution to try and prevent this."
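The article doesn't publish the actual searches Reese ran, but the pattern he describes, scrapers probing for pages that don't exist, can be sketched from a standard web access log. The log format, field layout and threshold below are assumptions for illustration only:

```python
import re
from collections import Counter

# Assumes combined-log-format access lines; the real Coles Online log
# format and alerting thresholds are not published.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" (?P<status>\d{3})'
)

def flag_scrapers(log_lines, threshold=50):
    """Count 404 responses per client IP and flag heavy offenders."""
    not_found = Counter()
    for line in log_lines:
        m = LOG_LINE.match(line)
        if m and m.group("status") == "404":
            not_found[m.group("ip")] += 1
    # IPs generating an unusually high number of "page not found"
    # responses are the scrape-like behaviour described above.
    return {ip: n for ip, n in not_found.items() if n >= threshold}
```

A legitimate shopper generates the occasional 404; a scraper walking guessed URLs generates hundreds, which is what makes the pattern "quite easy to recognise" once the data is in one place.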
Reese set up a regular alert that would email him whenever such behaviour was detected, but this became unwieldy, as the alerts had to be dealt with manually. The development team instead built a number of scripts that automatically scan access logs and block users engaging in that type of suspicious activity. The scripts also track blocked IP addresses to ensure they are released after a period of time.
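The blocking scripts themselves aren't described in detail, but the "block, then release after a period of time" behaviour can be sketched as a small expiring blocklist. The class name, block duration and clock injection are all assumptions, not Coles Online's implementation:

```python
import time

class TemporaryBlocklist:
    """Block IPs for a fixed window, then release them automatically.

    A minimal sketch of the timed-release behaviour described; the
    real scripts presumably push these blocks into a firewall or
    load balancer rather than an in-memory dict.
    """

    def __init__(self, block_seconds=3600, clock=time.monotonic):
        self.block_seconds = block_seconds
        self.clock = clock  # injectable for testing
        self._blocked = {}  # ip -> expiry timestamp

    def block(self, ip):
        self._blocked[ip] = self.clock() + self.block_seconds

    def is_blocked(self, ip):
        expiry = self._blocked.get(ip)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            del self._blocked[ip]  # release after the window elapses
            return False
        return True
```

Releasing blocks automatically matters here: a shared or reassigned IP address that once hosted a scraper shouldn't lock out legitimate customers forever.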
By surfacing its performance metrics, Coles Online has been able to turn necessary site monitoring from a chore into something that proactively helps Reese and his team manage the performance of the site. The team has been so taken with the approach that it is championing its analytics dashboards throughout the Coles Group, and is starting to apply the platform to its order-picking system. Reese expects the additional systems to push the amount of data being processed toward 10GB per day.
"We're getting some great insights out of that, and we're starting to push those reports and data to our business users for store leader boards and those kinds of things," he said. "We've learned that the key is not to let your data be locked away. Even though there are gigabytes of it being generated every day, having a tool that you can use to explore your data in real time is very empowering."