Photos: Inside the British Library's digital books project


See how it plans to digitise 25 million pages in just two years
The British Library is working with Microsoft and imaging company Content Conversion Specialists (CCS) on a massive book digitisation project.
Over a period of two years, around 100,000 books from the British Library's nineteenth-century literature collection will be made available on its online catalogue and Microsoft's Live Search Books. silicon.com paid a visit to the digitisation studio at the British Library in London.
When the imaging team is running at full capacity it processes one and a half trolleys of books - like the one above - per day.
Photo credit: Tim Ferguson
The digitisation studio has been up and running at full steam since the beginning of September and it is hoped that it will eventually process 25 million book pages.
There are four semi-automated digital scanners (above) in the studio, which allow the team to scan about 2,400 pages per hour.
Talking about the process, Richard Helle, CCS managing director, said: "We know that we handle cultural items, treasures."
Photo credit: Tim Ferguson
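The throughput figures quoted above can be put side by side with a little arithmetic. The following is a minimal sketch, not the project's own planning model: it assumes the 2,400 pages per hour is the combined rate of all four scanners and that scanning is spread evenly over every calendar day of the two years. The function and constant names are illustrative only.

```python
# Figures quoted in the article.
TOTAL_PAGES = 25_000_000   # target number of book pages
PAGES_PER_HOUR = 2_400     # assumed to be the combined rate of the four scanners
PROJECT_DAYS = 2 * 365     # two-year project, ignoring leap days (assumption)


def scanning_hours(total_pages: int, pages_per_hour: int) -> float:
    """Total scanner-hours needed to image every page at the given rate."""
    return total_pages / pages_per_hour


def hours_per_day(total_pages: int, pages_per_hour: int, days: int) -> float:
    """Average scanning hours per day if the work is spread evenly over `days`."""
    return scanning_hours(total_pages, pages_per_hour) / days


total = scanning_hours(TOTAL_PAGES, PAGES_PER_HOUR)
daily = hours_per_day(TOTAL_PAGES, PAGES_PER_HOUR, PROJECT_DAYS)
print(f"Total scanning time: {total:,.0f} hours")
print(f"Average per calendar day: {daily:.1f} hours")
```

Under those assumptions the target works out at roughly 10,400 hours of scanning, or a little over 14 hours a day for two years.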
To minimise the amount of book handling, the scanners turn the pages using a device (pictured) that uses suction to lift each page and turn it over.
The scanner operator has to make sure only single pages are turned, check the images and make sure the pages don't get removed or damaged.
A digital image is taken between each page turn. Once the images have been created, their content is analysed and made searchable by a cluster of 12 blade servers located in the library.
When the project has been completed, the data will take up around 30TB of storage space.
Photo credit: Tim Ferguson
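The storage estimate can be broken down per page and per book from the numbers in the article. This is a rough sketch only: it assumes the 30TB covers all 25 million pages and 100,000 books, and that a terabyte means 10^12 bytes. The variable names are illustrative.

```python
# Figures quoted in the article.
TOTAL_STORAGE_BYTES = 30 * 10**12   # 30TB, assuming decimal terabytes
TOTAL_PAGES = 25_000_000
TOTAL_BOOKS = 100_000

# Average figures implied by dividing the totals.
bytes_per_page = TOTAL_STORAGE_BYTES / TOTAL_PAGES
pages_per_book = TOTAL_PAGES / TOTAL_BOOKS
bytes_per_book = TOTAL_STORAGE_BYTES / TOTAL_BOOKS

print(f"Average storage per page: {bytes_per_page / 10**6:.1f} MB")
print(f"Average pages per book: {pages_per_book:.0f}")
print(f"Average storage per book: {bytes_per_book / 10**6:.0f} MB")
```

On these assumptions each page accounts for about 1.2MB on average, and a typical 250-page book for about 300MB.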
Here are some pages that have been scanned and turned into a PDF. The digitised pages are fully searchable, meaning researchers can find the words and terms they are interested in.
In this book of poetry a search for the word 'love' has been entered and the word in question is highlighted on the PDF document.
The British Library uses optical character recognition (OCR) technology to make the pages searchable.
Ultimately, the digitised books will be available on Microsoft's Live Search online application and made fully searchable.
Photo credit: Tim Ferguson
A scanner operator checks that a book is set up correctly so the machine can continue taking images.
The screen on the right shows the images as they are scanned into the system, so the operator can check the quality of each image as it goes through.
There is a strict system of quality assurance, with images rejected if the page colours contrast too much or if the majority of text cannot be made searchable using OCR.
Photo credit: Tim Ferguson
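The two rejection rules described above can be written as a simple check. This is a hypothetical sketch, not the library's actual quality assurance software: the article does not say how contrast is measured or where the cut-off lies, so `contrast_score` and `MAX_CONTRAST` are invented for illustration, as is the treatment of pages with no text.

```python
from dataclasses import dataclass


@dataclass
class PageImage:
    contrast_score: float   # hypothetical measure: 0.0 (flat) to 1.0 (extreme)
    words_total: int        # words on the page
    words_recognised: int   # words the OCR step managed to make searchable


MAX_CONTRAST = 0.8  # hypothetical threshold; the article gives no figure


def passes_qa(page: PageImage, max_contrast: float = MAX_CONTRAST) -> bool:
    """Return True if the image would be kept under the two rules in the article."""
    # Rule 1: reject if the page colours contrast too much.
    if page.contrast_score > max_contrast:
        return False
    # Assumption: a page with no text (e.g. an illustration) is not failed on OCR.
    if page.words_total == 0:
        return True
    # Rule 2: reject if the majority of the text could not be made searchable.
    return page.words_recognised / page.words_total >= 0.5


print(passes_qa(PageImage(contrast_score=0.3, words_total=100, words_recognised=90)))
print(passes_qa(PageImage(contrast_score=0.3, words_total=100, words_recognised=40)))
```

The first page would be kept and the second rejected, because fewer than half of its words were recognised.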
Books sometimes have fold-out sections that need to be scanned with a different scanner. The sections are flattened with a glass plate and then scanned, before being put in the correct place in the book.
Here an illustration (left) has been scanned and the image appears on the screen on the right. All of the digitised images are taken at a resolution of 300dpi.
Neil Fitzgerald, the digitisation project manager, said: "We're very happy with the quality of images that are being produced."
Photo credit: Tim Ferguson