MS Office 2007 versus Open Office 2.2 shootout
After yesterday's blog about the relevance of feature bloat, I figured I would follow up with some quantitative analysis of performance characteristics to measure resource bloat. This isn't the first time I've measured the CPU and memory consumption of Microsoft Office and OpenOffice.org; I have a whole series on it dating back to 2005. This time, I'm pitting the Microsoft-backed OOXML (Office Open XML) format against the OASIS-backed ODF (OpenDocument) format, using Microsoft Office 2007 and OpenOffice.org 2.2.
Before I start, I'm going to disclose the hardware, OS, and software I'm using to measure these two Office suites.
Hardware:
- Intel Core 2 Duo 2.13 GHz
- 2 GB DDR2-800
- ATI X800 PCI-Express Video Card
- 500 GB SATA-II hard drive housing the sample files
OS:
- Windows Vista
Software:
- Microsoft Sysinternals Process Explorer (resource measurement)
- Microsoft Office 2007
- OpenOffice.org 2.2
Baseline measurements for opening each application

| Application | CPU kernel (ms) | CPU user (ms) | CPU total (ms) | Peak memory (KB) | I/O reads | I/O writes | I/O other |
|---|---|---|---|---|---|---|---|
| MS Excel | 234 | 328 | 562 | 24308 | 14 | 10 | 1422 |
| OO.o Calc | 625 | 593 | 1218 | 47788 | 364 | 12 | 13106 |
| MS Word | 171 | 390 | 562 | 31776 | 136 | 13 | 1957 |
| OO.o Writer | 343 | 687 | 1031 | 46700 | 365 | 8 | 13120 |
| MS PowerPoint | 250 | 343 | 593 | 27796 | 14 | 10 | 1403 |
| OO.o Impress | 484 | 843 | 1328 | 52804 | 921 | 16 | 14849 |
| MS Access | 484 | 531 | 1015 | 25836 | 12 | 9 | 967 |
| OO.o Base | 781 | 906 | 1687 | 49984 | 1708 | 176 | 22832 |
Office 2007 base memory consumption went up significantly compared to the Office 2003 I measured last year, but it's still significantly less than OpenOffice.org 2.2. Some of the OpenOffice.org applications, like Base, require Java to run, and the memory consumption spikes over 70 megabytes as soon as you start navigating in the interface. However, the difference between Microsoft and OpenOffice.org base resource consumption has gotten smaller. Next, we test the CPU and memory utilization of Microsoft Excel and OpenOffice.org Calc when opening the same 16-sheet test file.
Opening large spreadsheet

| File format (application) | CPU kernel (ms) | CPU user (ms) | CPU total (ms) | Peak memory (KB) | I/O reads | I/O writes | I/O other |
|---|---|---|---|---|---|---|---|
| XLS (MS) | 265 | 2046 | 2312 | 115548 | 39 | 17 | 2376 |
| XLSX (MS) | 296 | 12406 | 12703 | 65548 | 687 | 19 | 1854 |
| ODS (OO.o) | 968 | 58875 | 59843 | 253680 | 899 | 22 | 15822 |

From these results, we can see that the OpenOffice.org ODF XML parser (while vastly improved) is still about 5 times slower than Microsoft's OOXML parser. OpenOffice.org also seems to consume nearly 4 times the amount of RAM to hold the same data. While OpenOffice.org continues to have fewer features than Microsoft Office, it continues to consume far more resources than Microsoft.
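The two ratios quoted above can be checked against the "Opening large spreadsheet" table. The following is only a minimal sketch: the dictionary layout and function name are illustrative, and it assumes the comparison is ODS against XLSX using total CPU time and peak memory.

```python
# Figures copied from the "Opening large spreadsheet" table.
# The dictionary layout and names below are illustrative only.
measurements = {
    "XLS (MS)": {"cpu_ms": 2312, "peak_kb": 115548},
    "XLSX (MS)": {"cpu_ms": 12703, "peak_kb": 65548},
    "ODS (OO.o)": {"cpu_ms": 59843, "peak_kb": 253680},
}


def ratio(metric, numerator, denominator):
    """Return how many times larger `numerator` is than `denominator` for a metric."""
    return measurements[numerator][metric] / measurements[denominator][metric]


# Assumption: the comparison in the text is ODS (OO.o) versus XLSX (MS).
cpu_ratio = ratio("cpu_ms", "ODS (OO.o)", "XLSX (MS)")
mem_ratio = ratio("peak_kb", "ODS (OO.o)", "XLSX (MS)")

print(f"ODS vs XLSX total CPU time: {cpu_ratio:.1f}x")
print(f"ODS vs XLSX peak memory:    {mem_ratio:.1f}x")
```

Under that assumption the ratios come out at roughly 4.7 for CPU time and 3.9 for peak memory, which matches "about 5 times" and "nearly 4 times".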
Even though these results still show drastic differences in CPU and memory consumption between MS Office 2007 and OpenOffice.org 2.2, the gap is not as extreme as in the results measured last year. It would appear that OpenOffice.org 2.2 has gotten significantly better than version 2.0, but it still has a lot to work on. The official OpenOffice.org performance-tuning wiki is tracking some of these improvements. I praise their recent efforts and hope they keep it up because it will only bring more competition to the table. So while I may still consider OpenOffice.org a resource pig, the pig has definitely lost some weight.
Talkback
Agree but...
That is the point, with today's hardware, the average user will feel the
Meanwhile, as MS adds more and more features, tries to improve every little aspect of performance, going head to head with OpenOffice, they miss that the real competition is going to be Google Apps. They are also competing on the paradigm of formatting everything for 8.5x11 paper, as that paradigm is about to be thrown on the scrap heap of history.
where are the test files?
I agree...
Oh - and a workbook containing nine sheets of 16,000 rows with one formula per row isn't any more "typical" this year than it was last year.
Typical for business use
But OO is perfect for home or even SOHO use where it's unlikely you will ever come across files of that size.
This might interest you
http://redmonk.com/sogrady/2005/10/25/the-rorschach-of-ooo-analysis/
"To 'prove' that OO.o is a pig, Ou offers up a sample file here for users to perform their own tests. So far, so good. But once you get the file down, you may notice something a little bit odd: the file is 3.6 MB's in size. That's larger than just about any spreadsheet I've seen, but then go one step further and unzip the file, and you'll discover that the content.xml portion of his sample file explodes to 279.5 MB's."
"My brother, as a backgrounder, is I-banking trained in his usage of Excel [1] and currently employed by a relatively well known hedge fund. The files he sent over were financial models of two public companies, and essentially reflect the financial well being of the institutions in question in spreadsheet form. One spreadsheet has 7 sheets and the other 12, and the sizes? 188.5 kb and 294 kb, respectively. ..... I asked him how often he dealt with huge Excel files of the size that Ou featured, and his response was that they were very much the exception to the rule - and his is the profession that conventional wisdom at least says are the true power users of the format."
I completely concur with his statements...
I've worked with financial/accounting groups in Fortune 500 companies. :P
Again which is another reason I keep pinpointing Ou's flaw with a 200 meg file. It's completely unrealistic.. It's just being used to exaggerate a difference but when push comes to shove and a normal user uses a normal every day file.. They will hardly notice a second difference if that.
Then I guess my Excel files are not HUGE!
When I go on trips I use Open Office to access this file. I do not care if it uses more, or if it is even slower. I use Excel on my desktop and Open Office on my laptop.
Having a HUGE excel file is nothing new to me. At work I constantly work with excel files that are around 3 to 4 hundred megs.
I am sure that the majority of the Excel files out there are a meg or less, but there are plenty of huge Excel files out there.
Now I know I can use accounting software to do everything I am doing. I have tried and I do not like the accounting software. They do NOT do everything I need it to do.
Again..
I am impressed
Seriously, I am impressed that you got a spreadsheet to do more than the limitations of any accounting software.
I would love to see all of the formulas and macros(?) that you had to create. That must have taken a lot of work.
I am not being sarcastic. That must have been a herculean undertaking.
That is the worst way to use excel..
And there you have it
Next topic--Linux slow boot times, probably.
Yeh definally. Yeh. (Marathon Man)
er ah (Rain Man)
30MB is routine
Even 20-30 megs
And about forecasters.. I totally agree. I'm doing that sort of process now with our forecast people here. :)
One has to use a file of sufficient size ...
I don't want to dispute George's methods, motives, or his results - I'll take his numbers at "face value".
So, when opening a 279 Mb spreadsheet, Excel could do it in a mere 2.3 seconds, while Calc took 60. That is a fairly significant difference.
However, if we can assume that there is something approaching a linear relationship between file size and the time it takes to open that file, the same two programs opening a file one-tenth the size could be expected to take 0.2 and 6 seconds respectively - plenty of difference for one to see.
But if an even more common file size of 2.79 Mbytes were to be used, then those numbers would drop to 0.02 and 0.6 seconds - you might have trouble telling the difference. And if a truly typical file size were tested, 279 Kb, then the 0.002 and 0.06 second times would be impossible for anyone to notice.
George's numbers are probably a fair reflection of the comparability of the two suites. So, if you deal with 100+ Mb files, you're definitely going to want Office. But if you use 100+ Mb spreadsheet files, you might also want to consider whether a database might be a better vehicle for storing, organizing and retrieving that much data.
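The linear extrapolation in this comment is easy to reproduce. This is a rough sketch that takes the comment's own figures (279 MB, 2.3 seconds for Excel, 60 seconds for Calc) and its assumption that open time scales linearly with file size; the function and constant names are made up for illustration, and none of the scaled numbers are measurements.

```python
# Reference point quoted in the comment: a 279 MB file, 2.3 s in Excel, 60 s in Calc.
REFERENCE_MB = 279.0
REFERENCE_SECONDS = {"Excel": 2.3, "Calc": 60.0}


def estimated_seconds(file_mb, app):
    """Scale the reference open time linearly with file size.

    Linear scaling is the comment's assumption, not a measured relationship.
    """
    return REFERENCE_SECONDS[app] * file_mb / REFERENCE_MB


for size_mb in (279.0, 27.9, 2.79, 0.279):
    excel = estimated_seconds(size_mb, "Excel")
    calc = estimated_seconds(size_mb, "Calc")
    print(f"{size_mb:8.3f} MB  Excel ~{excel:.4f} s  Calc ~{calc:.4f} s")
```

The Excel column comes out at 0.23, 0.023 and 0.0023 seconds, which the comment rounds to 0.2, 0.02 and 0.002.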
Finally, some perspective
Thanks.
OO.o also provides tools to reduce filesize
Many large text documents can be logically broken into chapters. With OO.o Writer, it is easy to have each chapter stored in a separate file and to build a Master Document that links the subdocuments.
Finally, I can put OO.o onto a blade and have users access virtual PCs on the blade. Then, using thin clients (using whatever scavenged hardware that will support a modern monitor), I can provide many users with access to OO.o. Since OO.o will be in memory (at least after 8:02 AM), it will load fast. Since our documents are on a SAN, we can read them very fast. While it's true we could do this with Office, the licensing is (relatively) expensive and a PITA.
So, while it is true that OO.o is slower than MSO for large documents stored on a PC, OO.o offers tools and deployment options that can effectively compensate for this issue.
Besides, OO.o is evolving much faster. If OO.o were to use something like CubeWerx's BXML, we could see the tables turn rather quickly. See http://www.w3.org/2003/08/binary-interchange-workshop/05-cubewerx-position-w3c-bxml.pdf
Dang fast HD, doncha' think?
Nearly 280MB in 2.3 seconds? That's over 120MB/sec. And that's just for opening the file. I'd sure like to know what hard disk he's using! Granted, 60 seconds for opening the same file (I'd assume it is?) is no speed freak (4.65MB/sec), but perhaps there's more going on than we're seeing. Maybe Excel is merely "opening" the file (i.e. getting the pointers, reading a few cells) whereas OOo is reading the whole file or perhaps the whole sheet before presenting the page. This might help account for some of the difference in observed time and observed system resource usage.
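The throughput figures here are simple division. A minimal sketch, using the sizes and times as quoted in this thread (280 MB and 279 MB are the commenters' round numbers, and the function name is illustrative):

```python
def implied_throughput_mb_per_s(size_mb, seconds):
    """Data rate implied if `size_mb` megabytes were processed in `seconds`."""
    return size_mb / seconds


# Figures as quoted in the comments; both treat the full uncompressed size as read.
excel_rate = implied_throughput_mb_per_s(280, 2.3)
calc_rate = implied_throughput_mb_per_s(279, 60)

print(f"Excel: {excel_rate:.1f} MB/sec")
print(f"Calc:  {calc_rate:.2f} MB/sec")
```

This gives about 121.7 MB/sec for Excel and 4.65 MB/sec for Calc, matching the numbers in the comment.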
After all, since Excel is a closed, proprietary program, we've no clues as to what's actually happening. While it would be a fairly clever trick to ONLY do the calculations on those cells which are "visible" (after all, who'd know, since you can't see the others), one test for this would be to have nearly all cells depend upon data in other cells, maybe on other sheets (there's 16,000 cells x 9 sheets, right?) to see how "fast" each of them is. For example, Sheet1.A2 would contain a number that Sheet2.A2 might need in a formula, which Sheet3.A2 would need for its formula, etc. and Sheet1.A3 would need data that was calculated from Sheet9.A2, etc. but Sheet1.A1 would need data from Sheet9.A15999. Now THAT would be a test! And it wouldn't be too hard to set up, either.
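A sketch of how the dependency chain described above could be generated. It only builds a mapping from cell reference to formula text, using the comment's `Sheet1.A2` notation; writing the result into an actual XLS, XLSX, or ODS file is left out, and the function name, the `+1` formula, and the row range are all assumptions made for illustration.

```python
def chained_formulas(sheets=9, first_row=2, last_row=15999):
    """Build a cell -> formula mapping where every cell depends on the previous one.

    Order of the chain: Sheet1.A2 -> Sheet2.A2 -> ... -> Sheet9.A2 -> Sheet1.A3 -> ...
    Finally Sheet1.A1 refers to the very last cell, so displaying the first
    visible cell requires evaluating the entire chain.
    """
    cells = {}
    previous = None
    for row in range(first_row, last_row + 1):
        for sheet in range(1, sheets + 1):
            ref = f"Sheet{sheet}.A{row}"
            # The first cell holds a plain number; every other cell adds 1 to its predecessor.
            cells[ref] = "1" if previous is None else f"={previous}+1"
            previous = ref
    cells["Sheet1.A1"] = f"={previous}"
    return cells


cells = chained_formulas()
print(len(cells), "cells")
print("Sheet1.A2:", cells["Sheet1.A2"])
print("Sheet2.A2:", cells["Sheet2.A2"])
print("Sheet1.A3:", cells["Sheet1.A3"])
print("Sheet1.A1:", cells["Sheet1.A1"])
```

Because no cell can be computed before its predecessor, an application that only evaluates visible cells would still have to walk the whole chain to show `Sheet1.A1`.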
Anyone want to bet that the 2.3 second "opening" gets blown away?
-R
2 seconds is for the 50 MB XLS file