Linux has finally made it onto the business map in the area of database benchmarks, helping take the wind out of Microsoft Corp.'s continued contention that open-source operating systems don't make good business sense.
While benchmark wars are commonplace in the database space, Microsoft had largely come to dominate the TPC numbers until the latest TPC-H results were released, using a combination of SQL Server 2000 running on Windows 2000 across a variety of hardware platforms.
The Transaction Processing Performance Council released new data this week showing that IBM's upcoming DB2 7.2 release running on Linux 2.4.3 outperforms SQL Server 2000 running on Windows 2000 in the 100GB category.
The winning entry was clocked on an SGI 1450 server. DB2 7.2 is due to ship on June 8, while the SGI-Linux-IBM combination system is expected to be commercially available from October 31.
The TPC-H is a decision support benchmark. According to the TPC Web site, TPC-H "illustrates decision support systems that examine large volumes of data, execute queries with a high degree of complexity and give answers to critical business questions."
The performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@Size), and it reflects multiple aspects of the capability of the system to process queries. The winning SGI-Linux-IBM entry clocked 2,733 QphH, compared with 1,699 QphH for the second-place SQL Server 2000 running on Windows 2000.
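To put the two scores in perspective, here is a minimal Python sketch. It relies on the TPC-H definition of the composite metric as the geometric mean of a system's power and throughput results; the function names are illustrative, and the separate power and throughput figures for these entries are not given in this article, so only the ratio of the published composite scores is computed.

```python
import math


def composite_qphh(power: float, throughput: float) -> float:
    """Combine power and throughput results into a composite QphH score.

    TPC-H defines the composite as the geometric mean of the two figures.
    """
    return math.sqrt(power * throughput)


def speedup(winner: float, runner_up: float) -> float:
    """Return how many times larger the winning score is."""
    return winner / runner_up


# Composite scores reported in the article (100GB category).
db2_linux = 2733
sql_server_windows = 1699

ratio = speedup(db2_linux, sql_server_windows)
print(f"DB2 on Linux scored {ratio:.2f}x the SQL Server result")
```

On the published numbers, the DB2 on Linux entry comes out roughly 61 percent ahead of the SQL Server entry.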
Big Blue answers critics
IBM officials said that the company submitted the benchmark for several reasons.
"We are interested in helping bring Linux into the mainstream. There has been no TPC benchmark published on Linux to date," said Berni Schiefer, IBM distinguished engineer and manager of performance and advanced technology with the data management solutions group. "No TPC-C, TPC-H, TPC-R or TPC-W."
At the same time, IBM wanted to answer critics who still maintain that DB2 runs only on IBM hardware, Schiefer added. "We have established DB2 as a multivendor, multiplatform database."
The DB2 on Linux configuration is running on a four-node/four-server-per-node SGI server that is currently shipping. But SGI is bringing some new features to its Linux release via its Pro Pack add-on, and IBM is adding some additional features to DB2 7.2 designed to make Linux more scalable, IBM officials said. For example, IBM is adding a vectored-read facility to DB2 that will enable large reads into buffer pools -- a feature aimed at improving data-warehousing support, Schiefer said.
While the DB2 on Linux TPC-H configuration isn't the least expensive of the top 10 TPC entries, IBM was tuning primarily for performance, not price/performance, with this benchmark, officials said. Although TPC council rules forbid vendors from making projections about future TPC submissions, IBM will likely make tweaks that improve the overall price/performance ratios of DB2 on Linux going forward, officials confirmed.
Jeff Ressler, lead product manager for Microsoft's SQL Server team, said it was "interesting to finally see someone else finally playing in the 100GB category, which we have dominated with no competition for some time."
While he accepted the results, the tests did not compare "apples to apples in the sense that the amount of computing horsepower there is not the same," Ressler said. "You do have to take into consideration things like the management requirements around a cluster of four machines versus a single machine."
Ressler also downplayed the Linux role in the benchmark, saying it was important also to remember that this was a database test and not an operating system evaluation. While benchmarks are important as customers look at them as a validation that a product will perform and scale, the real validation comes from experience in the field with running mission-critical systems, he said.
"And that's something Linux doesn't have -- a legacy of accountability like SQL Server or Oracle or DB2 have," Ressler added. "DB2 is an established database but is here running on an operating system that is a relatively new player," he said.
The winning entry also used twice as many processors as the second-place SQL Server 2000 running on Windows 2000. It was also not competitive on a price/performance basis.
"It remains to be seen how beneficial this will be for SGI or Red Hat or IBM for that matter," Ressler said. "Yes, it's got a good performance number but the price/performance is just not there.
"They are also using four machines rather than a single machine -- which means four times as much to manage -- as well as twice as many processors. We're also 25 percent more efficient than them for CPU," he said.
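Ressler's 25 percent figure can be roughly checked against the published scores. The sketch below assumes only what the article states: the composite scores and the two-to-one processor ratio. The absolute processor counts are not given here, so the values 1 and 2 stand in for that ratio, and the function name is illustrative.

```python
def per_cpu_advantage(score_a: float, cpus_a: int,
                      score_b: float, cpus_b: int) -> float:
    """Return how much higher entry A's score per processor is than entry B's.

    A result of 0.25 means A delivers 25 percent more QphH per CPU.
    """
    return (score_a / cpus_a) / (score_b / cpus_b) - 1.0


# Only the 2:1 processor ratio comes from the article; 1 and 2 are
# placeholders for the real (unstated) processor counts.
advantage = per_cpu_advantage(1699, 1, 2733, 2)
print(f"SQL Server per-CPU advantage: {advantage:.1%}")
```

The result lands a little under 25 percent, which is consistent with the figure Ressler cited.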
Going forward, Microsoft is actively working on a variety of benchmarks.
"You always want to be number one but having someone else playing in the 100GB space is a validation of its importance," Ressler said. "We're always looking for better performance, and you can expect to see further results from us in the future," he said.
An interesting footnote: Oracle wasn't on the list.
"They seem to have decided not to play in the space where customers are actually playing," Ressler said.