Cloud vendors will decide when GPUs and big data meet

Few companies have the sheer amount of data and processing demand needed to justify buying racks of GPUs, which leaves cloud vendors as the vehicle for bringing the GPU to big data.

In recent years, progress on the central processing unit has been restricted to adding more cores, improving performance per watt, and addressing ever larger amounts of memory.

While having 24 cores and addressing 24TB of RAM is certainly an improvement, the raw performance jumps that look nice on benchmarking charts haven't been there.

Meanwhile, in the graphics world, the sort of advancing performance that occurred in computing a decade ago is still happening. Nvidia's latest 1080 chip announcement almost hit the mark of twice the performance for half the cost of its Titan X predecessor.

Given such improvements, and the theoretical computational power available on the GPU, it is little wonder that Nvidia is talking up its data analytics credentials.

"Whether you're Monash University, a big telco, Google, in health, a researcher -- it's got to be GPU," Mark Patane, Nvidia ANZ country manager, said earlier this week.

"The fundamental difference is that you can try and run data on non-GPU systems, but as soon as you hit the enter key and it starts doing algorithms, that spinning wheel of death can take weeks before it comes back with an answer. With a GPU, you can do it a lot faster and the response will be in a matter of minutes."

If the GPU is as good as Patane claims it is, where are all the GPU-powered big data systems?

According to Dr Rod Fontecilla, vice president for Advanced Data Analytics at Unisys Federal, the current practice of relying on the CPU to handle computation remains good enough.

"We haven't seen the need, necessarily, to bring data into the GPUs," Fontecilla told ZDNet.

"Essentially, the computational power we get out of the Spark and our Spark cluster is significantly high that we haven't seen the need to go to changes in the architecture."

For Fontecilla, it is not raw computational power that is the main concern, but instead the analytical model used.

"We've been more interested on the predictive models and being able to solve the business problem," he said. "We are worrying more, when we look at advanced analytics, at the precision, how good these models are to solve the business problem.

"If they take a few more seconds to give me the solution, we are not talking about real-time here, we are talking about near real-time, so we don't see a significant impact in delaying some of the results."

Unsurprisingly, Intel thinks its x86 world will see off the GPU challenge.

"Most customers will tell you that a GPU becomes a one-off environment that they need to code and program against, whereas they are running millions of Xeons in their datacentre, and the more they can use single instruction set, single operating system, single operating environment for all of their workloads, the better the performance of lower total cost of operation," Diane Bryant, Intel executive vice president and general manager of its Data Center Group, recently told ZDNet.

"We already have the in-memory analytics world, we already have the deep scale-out world, we have 90 percent share of the server market, and machine learning [is] the next algorithm to be accelerated. It makes sense to do it on a [familiar] platform."

Bryant said that the usage of GPUs for big data is currently sitting at less than 1 percent of the market, and cloud providers are on the Xeon bandwagon in large numbers.

"Over 90 percent of all cloud service providers' servers are two-socket Xeons," she said.

This statistic cuts both ways. It is a strength, but also a fundamental weakness, because it shows where much of the power to determine which workloads are processed where now rests -- with cloud providers.

Just as Intel can more easily transition the industry to in-memory computing thanks to the large number of users on relatively few cloud services, so too can those users be migrated to a world of GPU computation quite quickly.

It's something that Fontecilla said Unisys is keeping an eye on.

"We're always beefing up our platform, and definitely we are looking at the advances of GPUs and new offerings on AWS using GPUs to do computation, so definitely we are open, but not necessarily doing anything just now," he said.

While those dealing with exabytes of data have a good reason to tackle GPU computation today, for everyone else, it's a case of waiting for the cloud provider of choice to make it affordable and easy enough to use before making the switch.

Intel may be laughing now, but it could quickly find the rug pulled out from underneath it if cloud vendors so choose.

So go the issues of procurement, testing, and maintenance in a world of leased infrastructure.

Disclosure: Chris Duckett attended Computex as a guest of Intel.