After a successful implementation of Red Hat Enterprise Linux and an open-source thin client, oil and gas exploration company Santos is looking towards deep analysis of big data.
Santos had been facing a problem: its data was stored in various places across the organisation, all around the globe. As the amount of data continued to grow, it became difficult to ensure that everyone was working from the most up-to-date data, and that all of it was backed up. The distributed nature of the company's IT also created multiple points of failure and poor performance for the geoscientists responsible for finding oil and gas in those enormous reams of data.
Finding oil and gas from data is more art than science, according to Santos IS subsurface manager Andy Moore. He said that the systems need to work as fast as his scientists think.
When its proprietary thin-client vendor raised its licensing costs two years ago, Santos decided to deal with its problems by moving to Red Hat Enterprise Linux 5. It also uses the TurboVNC and VirtualGL open-source projects to enable its scientists to access 3D representations, generated by a program called Paradigm, of data housed on a central pool of servers.
The scientists could all see the same version of the data while collaborating over video conferencing. Not only did this end the data-consistency issues, but the combined memory of the server farm also enabled the company to analyse larger datasets than it previously could, since scientists had formerly been limited to running analyses on chunks of data on their own machines.
"You can run analytics on a dataset to reveal attributes of the data that you would not see if you were viewing individual pieces of the data," Moore said.
This enables unique insights, he said.
"What you might be working on one workstation is the left leg of the elephant. You might know that left leg really, really well, but you have no idea it's the left leg of an elephant."
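Moore's "elephant" point can be sketched with a toy example (illustrative only, not Santos's actual workflow or data): a trend that every workstation-sized chunk misses, but that the combined dataset reveals.

```python
# Toy illustration of analysing chunks vs the whole dataset.
# Within each small chunk the values drift downward, but across
# the combined dataset the trend rises steeply.

def slope(points):
    """Least-squares slope of a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    num = sum((x - mx) * (y - my) for x, y in points)
    den = sum((x - mx) ** 2 for x, _ in points)
    return num / den

# Three "workstation-sized" chunks, each locally trending down ...
chunks = [
    [(0, 10), (1, 9), (2, 8)],
    [(10, 30), (11, 29), (12, 28)],
    [(20, 50), (21, 49), (22, 48)],
]

print([round(slope(c), 2) for c in chunks])  # → [-1.0, -1.0, -1.0]

# ... but the full dataset slopes up: the "elephant" only appears whole.
combined = [p for c in chunks for p in c]
print(round(slope(combined), 2))             # → 1.97
```

Each analyst holding one chunk would conclude the values are falling; only the pooled data shows the larger rising structure.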
Given the success of the implementation, Santos is currently testing whether it can virtualise the Windows environment on a Unix server, so that it can capture pockets of data from applications that it hasn't managed to suck into the server farm yet; for example, Petrel.
After about a year spent working through 3D-visualisation issues caused by faulty code in the virtualised operating system, the company managed to make the environment work. The question now is whether virtualisation will deliver the performance that scientists require.
"We don't know if it's production capable of supporting a wide number of users," Moore said.
The company is currently testing this, and hopes to decide by the end of June whether it's going to be viable.
"If it doesn't perform as quickly as the application does locally, we'll live with the data-management headache," he said.
With the company more easily able to access its data, it's now looking into how best to analyse it. It can do that analysis just by using its traditional techniques, according to Moore; however, Santos is also talking to "well-known companies like IBM" to see whether they could help it to perform analyses on the large sets of structured and unstructured data.
An example of what Moore hopes to achieve from data analysis is the ability to predict when compressors on an oil field will fail. As it stands, if a compressor fails, the company replaces it, but loses production time while waiting for the new compressor to arrive and be installed. Ideally, if data from all the compressors could be collected and compared, the company could start to predict when each might fail, allowing pre-emptive replacement and avoiding lost production time.
"We're seriously looking at that at the moment," Moore said. "We're researching how we're going to do it."
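A rough sketch of the idea (with hypothetical compressor names, readings, and thresholds, not anything from Santos's actual systems): at its simplest, fleet-wide comparison means flagging any unit whose sensor reading has drifted well above the fleet average.

```python
# Minimal sketch of fleet-wide outlier detection for predictive
# maintenance. All identifiers and values here are invented examples.
from statistics import mean, stdev

def flag_at_risk(readings, z_threshold=1.5):
    """readings: {compressor_id: latest vibration reading (mm/s)}.

    Returns the ids whose reading sits more than z_threshold standard
    deviations above the fleet mean - candidates for early replacement.
    """
    values = list(readings.values())
    mu, sigma = mean(values), stdev(values)
    return sorted(
        cid for cid, v in readings.items()
        if sigma > 0 and (v - mu) / sigma > z_threshold
    )

fleet = {"C-101": 2.1, "C-102": 2.3, "C-103": 2.2, "C-104": 6.8, "C-105": 2.0}
print(flag_at_risk(fleet))  # → ['C-104']
```

A real system would of course track trends over time rather than single readings, but the principle is the same: a compressor that looks fine in isolation stands out once it is compared against the whole fleet.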