
Photos: Inside the British Library's digital books project

See how it plans to digitise 25 million pages in just two years
By Tim Ferguson, Contributor
1 of 6 Tim Ferguson/ZDNET

The British Library is working with Microsoft and imaging company Content Conversion Specialists (CCS) on a massive book digitisation project.

Over two years, around 100,000 books from the British Library's nineteenth-century literature collection will be made available on its online catalogue and on Microsoft's Live Search Books. silicon.com paid a visit to the digitisation studio at the British Library in London.

When the imaging team is running at full capacity, it processes one and a half trolleys of books, like the one above, per day.

Photo credit: Tim Ferguson

2 of 6 Tim Ferguson/ZDNET

The digitisation studio has been up and running at full steam since the beginning of September and it is hoped that it will eventually process 25 million book pages.

There are four semi-automated digital scanners (above) in the studio, which allow the team to scan about 2,400 pages per hour.
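As a rough back-of-envelope check on these figures, the sketch below estimates how long 25 million pages would take at about 2,400 pages per hour. It assumes the quoted rate is the combined throughput of all four scanners (the article does not say whether it is per scanner or combined), that the rate is constant, and that "two years" means 730 calendar days. The function names are hypothetical and purely illustrative.

```python
# Back-of-envelope scanning time, using figures quoted in the article.
TOTAL_PAGES = 25_000_000   # target page count (from the article)
PAGES_PER_HOUR = 2_400     # studio throughput (from the article; ASSUMED combined)
PROJECT_DAYS = 2 * 365     # "two years" taken as calendar days (ASSUMPTION)


def scanning_hours(total_pages: int, pages_per_hour: int) -> float:
    """Hours of scanning needed at a constant rate."""
    return total_pages / pages_per_hour


def hours_per_day_needed(total_pages: int, pages_per_hour: int, days: int) -> float:
    """Average daily scanning hours needed to finish within `days`."""
    return scanning_hours(total_pages, pages_per_hour) / days


if __name__ == "__main__":
    total = scanning_hours(TOTAL_PAGES, PAGES_PER_HOUR)
    daily = hours_per_day_needed(TOTAL_PAGES, PAGES_PER_HOUR, PROJECT_DAYS)
    print(f"Total scanning time: {total:,.0f} hours")
    print(f"Average needed per calendar day: {daily:.1f} hours")
```

Under these assumptions the total comes to roughly 10,400 scanning hours, so the quoted rate and the two-year target are only compatible if the scanners run for long stretches each day or the rate is per scanner rather than combined.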

Talking about the process, Richard Helle, CCS managing director, said: "We know that we handle cultural items, treasures."

Photo credit: Tim Ferguson

3 of 6 Tim Ferguson/ZDNET

To minimise handling, the scanners turn the pages with a device (pictured) that uses suction to lift each page and turn it over.

The scanner operator has to make sure only single pages are turned, check the images and make sure the pages don't get removed or damaged.

A digital image is taken between page turns. Once the images have been created, their content is analysed and made searchable by a cluster of 12 blade servers located in the library.

When the project has been completed, the data will take up around 30TB of storage space.
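To put the storage figure in perspective, the sketch below works out the average size per page and per book implied by the article's numbers (30TB, 25 million pages, around 100,000 books). It assumes decimal units (1TB = 10^12 bytes) and that the 30TB covers all 25 million pages; the article states neither, and the function names are hypothetical.

```python
# Average storage per page and per book implied by the article's figures.
TOTAL_BYTES = 30 * 10**12   # 30TB, ASSUMING decimal terabytes
TOTAL_PAGES = 25_000_000    # from the article
TOTAL_BOOKS = 100_000       # "around 100,000 books" (from the article)


def bytes_per_page(total_bytes: int, total_pages: int) -> float:
    """Average bytes stored per scanned page."""
    return total_bytes / total_pages


def pages_per_book(total_pages: int, total_books: int) -> float:
    """Average number of pages per book."""
    return total_pages / total_books


if __name__ == "__main__":
    per_page_mb = bytes_per_page(TOTAL_BYTES, TOTAL_PAGES) / 10**6
    avg_pages = pages_per_book(TOTAL_PAGES, TOTAL_BOOKS)
    print(f"Average per page: {per_page_mb:.1f} MB")
    print(f"Average pages per book: {avg_pages:.0f}")
    print(f"Average per book: {per_page_mb * avg_pages:.0f} MB")
```

On these assumptions each page averages about 1.2 MB and each book about 250 pages, or roughly 300 MB per book.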

Photo credit: Tim Ferguson

4 of 6 Tim Ferguson/ZDNET

Here are some pages that have been scanned and turned into a PDF. The digitised pages are fully searchable, meaning researchers can find the words and terms they are interested in.

In this book of poetry, a search for the word 'love' has been entered and the word is highlighted in the PDF document.

The British Library uses optical character recognition (OCR) technology to make the pages searchable.

Ultimately, the digitised books will be available on Microsoft's Live Search online application and made fully searchable.

Photo credit: Tim Ferguson

5 of 6 Tim Ferguson/ZDNET

A scanner operator checks that a book is set up correctly for the machine to continue taking images.

The screen on the right shows the images as they are scanned into the system, so the operator can check the quality of each image as it goes through.

There is a strict system of quality assurance, with images rejected if the page colours contrast too much or if the majority of text cannot be made searchable using OCR.

Photo credit: Tim Ferguson

6 of 6 Tim Ferguson/ZDNET

Books sometimes have fold-out sections that need to be scanned with a different scanner. The sections are flattened with a glass plate and then scanned, before being put in the correct place in the book.

Here an illustration (left) has been scanned and the image appears on the screen on the right. All of the digitised images are taken at a resolution of 300dpi.

Neil Fitzgerald, the digitisation project manager, said: "We're very happy with the quality of images that are being produced."

Photo credit: Tim Ferguson
