The British Library saves the .uk web, starting 20 years too late

The British Library and the UK's other legal deposit libraries have started archiving around 5 million websites in the .uk domain, but the past 20 years may well remain a digital black hole

A long-running scandal finally ended on Friday with the signing into law of new legislation that allows the British Library and other legal deposit libraries to archive around 5 million websites in the .uk domain. British content on other domains, such as .com and .org, will be added later.

While the legislation is to be applauded, it's two decades too late to capture the early history of web development in the UK. Massive amounts of valuable data have presumably been lost forever, and there will always be a digital black hole in British history. The consolation is that the Internet Archive, founded by American digital activist Brewster Kahle in 1996, scooped up and preserved some of it in its Wayback Machine.

The British Library has been one of the UK's copyright libraries since 1662, which means publishers have been legally obliged to give it a free copy of everything they print. This has resulted in a priceless archive, albeit one that takes up 500 miles of shelf space.

It would have been logical to make the BL similarly responsible for storing copies of web-based publications as well. If it didn't feel it had the legal right, or the money, the British government should speedily have provided both.

The British Library in its new building near King's Cross in London. Photo: Jack Schofield

On a personal note, I challenged librarians to start archiving the web when I opened the Libtech 96 exhibition in 1996, but my pleas fell on deaf ears.

The government eventually gave the archive its blessing in 2003. Unfortunately, the project was then delayed for another decade because, in the words of the Financial Times: "To win support for the project, the British Library had to convince the UK’s biggest publishers that the archive would not damage their commercial interests."

This has resulted in restricted access to the archive. People will only be able to read the UK's digital archive while on the premises of one of the UK's six legal depositories, and only one person will be allowed to read a document at a time. This mirrors the fact that readers had to visit a copyright library to access printed publications, but is it a sensible use of current technology?

It's perhaps surprising that digital articles are not delivered to each user on a floppy disk, preceded by a flunky carrying a red flag.

The six legal depositories are the British Library, the Oxford and Cambridge university libraries, the National Libraries of Scotland and Wales, and the Trinity College library in Dublin.

The digital system should save money because publishers will be able to supply digital files to these copyright libraries instead of paper publications, and the digital files will also be cheaper to store and retrieve.

The BL has selected a "Curator's 100" list of websites that shows an emphasis on cultural artefacts. The list includes the Beano comic, the Unst bus shelter in Scotland, Street Art London, Rights of Women, Mumsnet, The Dracula Society, and The Daily Mash, as well as more obvious government websites.

The list also includes shopping and advertising sites such as Amazon, Argos, eBay, Gumtree and TripAdvisor, along with Facebook and Twitter.

This has prompted some alarm, because people who make stupid remarks online -- there's no shortage -- may well find their comments preserved for the amusement of future generations.

The archive of e-books and publications should be accessible in the summer, and web content from the start of 2014.




