Office 12 defaulting to .XML file format

Chris Capossela, who runs product management for the Office family of products, dropped by to see me to dribble out more details about the next version of Microsoft Office (currently dubbed "12"), which is due in the second half of 2006.

The important revelation, which was expected, is that some Office 12 applications (Word, Excel and PowerPoint) will use Office Open XML as the default file format. Note: Excel and Word already have XML support and related schemas for saving documents with full fidelity as XML files. The formats are industry-standard XML 1.0 and the schemas are available on a royalty-free basis. As a result, developers can query what's in a file and extract specific data, or write their own compatible applications to view and manipulate the files. Users can open the .XML files in any application that can read XML. "Our value is not tied to file format, but to the user experience and quality of the software," Capossela said. Now that's a refreshing point of view, given how often Microsoft has made it difficult in the past for others to parse its file formats.
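
To make the "query what's in a file and extract specific data" point concrete, here is a minimal Python sketch. The markup in SAMPLE and the extract_rows helper are my own invention for illustration; they are not the real Office schemas, which use their own element names and namespaces.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup for illustration; NOT the real Office schema.
SAMPLE = """<workbook>
  <sheet name="Sales">
    <row><cell>Widgets</cell><cell>120</cell></row>
    <row><cell>Gadgets</cell><cell>80</cell></row>
  </sheet>
</workbook>"""


def extract_rows(xml_text, sheet_name):
    """Return each row of the named sheet as a list of cell strings."""
    root = ET.fromstring(xml_text)
    rows = []
    for sheet in root.findall("sheet"):
        if sheet.get("name") == sheet_name:
            for row in sheet.findall("row"):
                rows.append([cell.text for cell in row.findall("cell")])
    return rows


print(extract_rows(SAMPLE, "Sales"))
```

Nothing here depends on Office being installed, which is the practical upshot of a documented XML format: any XML parser will do.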

What's new for Microsoft is compacting the often overweight XML text files with industry-standard Zip compression. The data within a document (including comments, charts and document metadata) is segmented into separate components, which are compressed when the file is saved and decompressed when it is opened. However, OLE objects and images are still stored as binaries.
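
The general idea of a Zip container holding separate XML components can be sketched in a few lines of Python. The part names and contents below are assumptions made up for the example, not the actual layout of an Office 12 file.

```python
import io
import zipfile

# Hypothetical part names and contents, for illustration only.
PARTS = {
    "document.xml": "<document><p>Hello</p></document>",
    "comments.xml": "<comments><c>Check this</c></comments>",
    "metadata.xml": "<meta><author>Someone</author></meta>",
}


def build_package(parts):
    """Write each XML part as its own compressed entry in an in-memory Zip."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, xml_text in parts.items():
            zf.writestr(name, xml_text)
    return buf.getvalue()


def read_part(package_bytes, name):
    """Decompress and return a single part without reading the others."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        return zf.read(name).decode("utf-8")


package = build_package(PARTS)
print(read_part(package, "comments.xml"))
```

Because each component is its own Zip entry, a tool that only cares about comments or metadata can pull out that one part and ignore the rest.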

Using Zip gets around the thorny issue of creating a binary XML to deal with file bloat. A few months ago I had a conversation with Jean Paoli, co-creator of the XML standard and senior director of XML architecture at Microsoft, who told me that binary XML is "nonsense." From his viewpoint, it's not possible to create a one-size-fits-all binary XML standard to solve all the performance and size issues. "I am not negating the problems, but it's not a matter of creating a binary," Paoli said. At that time he mentioned existing technology, such as XML-binary Optimized Packaging (XOP) from the W3C, or using Zip. "Everybody has Zip, and XML Zips very well. For many scenarios it's good enough," Paoli said. He also projected that by 2010, 75 percent of documents would be stored in XML format.
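
Paoli's remark that "XML Zips very well" is easy to try for yourself. The sketch below uses Python's zlib module, which implements DEFLATE, the compression method Zip tools most commonly use. The sample markup is synthetic and highly repetitive, so it will compress better than a typical real document; treat the result as an illustration of the effect, not a measurement of Office files.

```python
import zlib


def compression_saving(text):
    """Return the fraction of bytes saved by DEFLATE-compressing the text."""
    raw = text.encode("utf-8")
    packed = zlib.compress(raw)
    return 1 - len(packed) / len(raw)


# Synthetic, made-up markup: 500 near-identical rows of verbose tags.
sample = "<rows>" + "".join(
    f"<row><cell>Item {i}</cell><cell>{i * 3}</cell></row>" for i in range(500)
) + "</rows>"

saving = compression_saving(sample)
print(f"{saving:.0%} smaller after compression")
```

The repeated tag names are exactly what a dictionary-based compressor handles well, which is why verbose XML shrinks so readily.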

Using XML and Zip is not a unique approach, however: the open-source Office competitor OpenOffice (sponsored by Sun) already uses an XML-based file format and Zip compression to store files. The OpenOffice XML file format specification is maintained by an OASIS technical committee. According to a Microsoft spokesperson, OpenOffice has royalty-free access to the specs for the Office Open XML formats to ensure file compatibility. The current XML filter tool in OpenOffice supports the Microsoft Office 2003 XML file formats, although not always with full fidelity.

According to Capossela, users won't notice any difference from the compressing and decompressing of files, and file size will be reduced 50 to 75 percent, resulting in savings on bandwidth and network storage. The file formats will be backwards compatible with Office 2000, and Microsoft will have tools to bulk convert files. None of the preexisting file formats are going away either.

One of the unique benefits of .XML, besides enabling more fluid interoperability with data and applications outside of Office, is that the XML-based file format improves data recovery of corrupted files because it saves different types of data and puts them into discrete components. Instead of corrupting an entire file, only a part of it would be damaged. The XML formats will also help prevent executable payloads, such as viruses, from being delivered inappropriately in files.
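
The recovery argument can be illustrated with a short Python sketch: parse every component on its own and set aside the ones that fail, rather than giving up on the whole file. The part names are hypothetical, and the "damage" is simulated as truncated XML inside one part; damage to the Zip container itself is a separate problem this sketch does not cover.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def load_parts(package_bytes):
    """Parse every part independently; collect failures instead of aborting."""
    good, bad = {}, []
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        for name in zf.namelist():
            try:
                good[name] = ET.fromstring(zf.read(name))
            except ET.ParseError:
                bad.append(name)
    return good, bad


# Build a small in-memory package with one deliberately damaged part.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("document.xml", "<document><p>Body text</p></document>")
    zf.writestr("comments.xml", "<comments><c>truncated")  # simulated damage

good, bad = load_parts(buf.getvalue())
print(sorted(good), bad)
```

The body text survives even though the comments part is unreadable, which is the behavior a single monolithic binary file cannot easily offer.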

A preview of Office 12 (not an initial beta, which isn't due until the fall) will be available on Monday, June 6. I asked about XML file formats for Macintosh Office, but Capossela wasn't sure: Mac Office is done by a different business group at Microsoft. Nor is a Linux version of Office on the drawing board. We'll also have to wait to hear about other features that will make it into Office 12. The dribbling continues...

  • I wonder about compatibility

    M$ patented an XML/binary hybrid format last year. Why would they patent it, if they weren't going to use it? Seems to me that whatever XML format M$ SAYS it will use, my money is on M$ using something proprietary.
    Roger Ramjet
  • Internal format unimportant

    Surely MS can store the data internally however they want provided the data is always represented externally in the same way to programmers and users?

    There is therefore no real need for MS to store the data natively as XML.

    Using XML as the external logical presentation of the data is however a fundamental mistake, XML being logically a throwback to the technologically obsolescent hierarchical approaches of the sixties and seventies.

    If you put a thousand programmers in front of a thousand PCs how long would it be before they discovered the relational model by accident?
    • OpenOffice

      So is OpenOffice wrong to do the same thing, or is it only wrong when Microsoft does it?

      Carl Rapson
      • Both are mistaken

        The choice of logical representation is in both cases misguided, inflexible and unable to represent data in a consistent way.

        Microsoft have no monopoly on daft ideas.
        • Well then,

          what would be a better way to do it? It seems to me that there are two basic choices in document formats -- open or closed. Assuming open is the way to go, what would be a better format than XML?

          Carl Rapson
          • We could start by forgetting about documents

            The choice isn't between open and closed; it is the choice between a good quality approach to representing data logically and a deeply flawed one (XML).
          • blah, blah, blah

            And you would suggest what? It's easy to sit and cast stones. Now stack them up for those of us who, obviously, aren't as well informed as you are.

            Like everyone using XML. Like the Fortune 500.
          • XML will fail

            The hierarchical approach embodied in XML was already tried in the 60s and 70s and has been overwhelmingly superseded by data management techniques based on the relational model (despite SQL's inadequacies as an implementation of that model). One can already see XML trying to jump through the same hoops the hierarchical database vendors went through to try and make their products work before being overwhelmed by the prevailing tide of SQL DBMSs.

            The Fortune 500 companies that are buying this obsolete technology are doing their shareholders and customers a grave disservice.

            Of course there are those who see some advantage to shamelessly embracing and promoting every fad that comes along rather than using already proven methods. Or maybe you favour unquestioningly swallowing everything the IT industry tries to sell you? If you see making detailed reasoned criticism as casting stones then I suggest you have a problem.
  • Netscape, IE, XML, and now Office

    Wasn't there an issue with installing Netscape 8 breaking IE's XML rendering? Would this also carry over into this new version of Office? Think of all of the users who upgrade to the new version of Office, then, tired of the problems with IE, install Netscape, and poof! Office documents can't be opened.
    • I suspect...

      ...that problem was inadvertent on Netscape's part and will be fixed before too long. The articles stated that MS would be working with Netscape to resolve it.

      Carl Rapson
      • MS working to help a competitor?

        "MS would be working with Netscape to resolve it."

        I've never seen MS work on problems with competitors' products with any speed. All I've ever heard is that MS expects others to patch their own software and that MS will do nothing for free. Hell, MS won't even help customers for free!
  • Full Fidelity? Not so fast!

    Your article says "Note: Excel and Word already have XML support and related schemas for saving documents with full fidelity as XML files."

    Not true.

    When Excel 2003 saves workbooks in XML format it omits charts, images, VBA code, certain types of grouping, etc.

    Fidelity is good but it's certainly not full.