The Queensland University of Technology (QUT) was having problems sorting through the massive amounts of data generated by its student and academic management system (SAMS).
SAMS carries information about courses, units, students and academic records. Authorised staff are able to update information in it as the school year progresses. In 2010, the university decided to deploy Splunk, data management software, to ease its big data woes, spending $100,000 on the software.
Splunk can monitor and analyse machine-generated data, and runs on all major operating systems. Machine data can include transactions, network activities and even call records generated by websites, applications and mobile devices.
The university already had a range of data management tools, but Splunk could consolidate their functions into one piece of software. It was selected because, like many core applications within an organisation, the student system required a lot of customisation, and the resulting data became hard to handle with traditional monitoring, reporting and analytics tools. Splunk can collect and handle data from many different sources, and was able to pull information from QUT's bespoke SAMS.
"In the past, it may have taken us potentially hours to collate, query and access the information," QUT enterprise IT systems technical leader Daniel Hili said. "Now, using Splunk across the same raw data, it's only a matter of minutes."
Initially, the deployment covered only the visibility of SAMS. Now, it is also used for QUT's Active Directory domain service. The university generates approximately 32GB of data per day, and the Active Directory service accounts for 24GB of that.
Since introducing the software, the university has found it is getting the right kind of information much faster than before. It has, according to Hili, helped IT support staff access relevant information to diagnose user authentication issues, without having to immediately escalate to the service administrators.
"Particularly for Active Directory, large volumes of events are generated per minute across a distributed environment," he said. "These events are typically inaccessible to all but service administrators or security analysts, and even with access, it can be challenging to trace specific events without specialised knowledge of how the service works."
QUT IT staff are now able to expose and translate the raw log data in real time and "in a meaningful way" so issues can be addressed immediately, without involving back-line support.
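The kind of translation Hili describes can be illustrated with a minimal sketch: filtering a stream of raw Active Directory audit events down to the authentication failures a front-line support worker actually cares about. The log format, field names and sample data below are illustrative assumptions, not QUT's actual schema or Splunk's query language; only the Windows event IDs (4624 for a successful logon, 4625 for a failed one) are real.

```python
import re
from collections import Counter

# Hypothetical raw audit log lines in an illustrative key=value format.
# Real Active Directory events are far noisier; this stands in for them.
RAW_LOGS = [
    "2013-05-01T09:00:01 EventID=4624 User=jsmith Status=Success",
    "2013-05-01T09:00:02 EventID=4625 User=adoe Status=Failure Reason=BadPassword",
    "2013-05-01T09:00:05 EventID=4625 User=adoe Status=Failure Reason=BadPassword",
    "2013-05-01T09:00:09 EventID=4624 User=kwong Status=Success",
]

LINE_RE = re.compile(r"EventID=(\d+) User=(\S+)")

def failed_logons(lines):
    """Count authentication failures per user.

    Windows event ID 4625 is 'an account failed to log on'; surfacing
    these counts per user lets support staff spot a struggling account
    without escalating to the service administrators.
    """
    failures = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group(1) == "4625":
            failures[match.group(2)] += 1
    return dict(failures)

print(failed_logons(RAW_LOGS))  # {'adoe': 2}
```

In practice Splunk does this kind of field extraction and aggregation over the full event stream as it arrives; the sketch only shows why turning raw events into per-user counts makes the logs useful to people without specialised knowledge of the service.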
"We had all these logs, but couldn't get to the information in a reasonable amount of time and as efficiently as possible," Hili said. "Once we started throwing that information into Splunk, initially as a trial, it was throwing up answers to all sorts of questions."
"Some we were looking to answer and some we hadn't even thought of asking."
It is worth noting, however, that QUT does not yet have a cohesive strategy to deal with big data.
"We are developing strategies for certain data types, data classification types — more along the lines of security classification," Hili said. "But we don't have a big data strategy right now."
There is also no specific person responsible for analysing the data, with analysis being very much a team effort, according to Hili.
He has 20 people in his IT department and all of them have some engagement with Splunk at varying levels.
"Everybody has a role in data analytics," Hili said. "As administrators and IT system personnel, we all have to analyse and monitor the logs, and Splunk definitely helps with that."
It takes new users around six days to get their heads around Splunk, according to Hili, who doesn't believe there is a need for data scientists to get involved in the process of analysing data — for his department, at least.
"I know there'd be some research project that would probably benefit from the help of a data scientist, but in terms of understanding Splunk, you can pick that up pretty quickly."