While in no way unique in facing the challenge of how to preserve digital documents, the UK's National Archives certainly faces the issue on a larger scale than most organisations.
The National Archives (NA) is the UK government's official archive and contains around 900 years of historical material. It's not easy to put an exact figure on the amount of information the organisation is tasked with preserving, but in an average year the government produces around 150km of paper-based documents, and the archive deals with around 1.5km of this. When you include the huge number of digital documents that government departments are now mandated to produce where possible, and a plethora of different websites, it's fair to say that the archive's chief information officer, David Thomas, has some job on his hands.
Thomas, a former archivist who has been at the organisation for most of his career, is charged with a very important function: to oversee the IT aspects of safeguarding government documents for future generations, for historical and legal purposes. This ranges from helping with an increasing number of Freedom of Information requests, to overseeing the preservation of reports from the Bloody Sunday inquiry.
"We take government records when they are 30 years old and make them available to the public. In the electronic world, it's more recent than that. Records of royal commissions or public inquiries we take pretty much as soon as the public inquiry is released (inquiries into sunken ships, for instance)," he says.
Despite a relatively low profile, the NA attracted some attention recently when it issued a joint press release with Microsoft detailing how the software maker was providing the NA with access to its Virtual PC 2007 software, which allows previous versions of Windows and Office to be used side by side on a single PC. This process, known as emulation, allows documents to be viewed on modern desktop platforms while remaining in the format in which they were created.
Microsoft UK managing director Gordon Frazer was keen to point out that his company was working hard to avert a "digital dark age" caused by the incompatibility of old electronic documents with new formats. However, critics would point out that Microsoft's aggressive and proprietary upgrade policy for its Windows and Office platforms is part of the problem.
An emotive issue
Rather than making Microsoft appear heroic about a problem that it had contributed to, the release highlighted a bigger issue that goes to the heart of digital preservation. As well as using the Virtual PC 2007 software to emulate older versions of Microsoft software, the NA also commented on its ongoing work to convert documents into open file formats. Because that work was mentioned in a Microsoft press release, some concluded that the NA planned to adopt Microsoft's Open XML document format, which has been criticised by open-source advocates for not being very open at all.
"If it were, Microsoft wouldn't need to make Novell and Xandros and Linspire sign NDAs and then write translators for them," wrote Pamela Jones, open-source expert and editor of the Groklaw blog.
When pushed on whether the NA has plans to use Microsoft's Open XML format exclusively, Thomas is keen to point out that his organisation is not tied to any one technology, but will use the best tools available to do the job at hand.
"All things are open on the table at the moment. What I think we are going to have to do is look at what is available to people on their desktop at a particular time and we will migrate to a format that they can read," he says. "Whether it's Google Docs, whether it's open document format or Microsoft Word, we will have to make judgements. The crucial thing is that the information is going to be readable using the standard tools you find on the desktop — we are not rigidly bound to one approach."
However, Thomas says he is aware that there is a lot of support for alternative open-source formats such as OpenDocument Format (ODF), and for the idea of not locking public documents into a format such as Open XML that is mainly championed by one vendor.
"For people involved in the debate it can be a very emotive issue — particularly the opponents of the Microsoft approach. We are neutral; we welcome open-source software because it makes our lives easier," he claims.
But although he supports an open approach to digital data formats, Thomas does not think it's his place to mandate the use of open source within the NA or the government as a whole. Critics of the Microsoft approach...