Cut. Scan. Read.

TOPICS: Hardware

Google Books has nothing on me: for months now I've been cutting the pages out of old books, scanning them into PDFs, storing them on a hard drive and recycling the paper. OCR makes the text searchable and selectable.

Most important: I'll never schlep them up a flight of stairs again.

Ripping books. Literally.

I have a sheet-fed scanner - a Fujitsu ScanSnap S510M - which works quickly. It handles about 20 sheets per minute, scanning both sides, so a 200-page book (100 sheets) takes about 5 minutes to scan.
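The scan-time arithmetic can be sketched in a few lines of Python. This is only a rough estimate, assuming the roughly 20 sheets per minute quoted above and the 50-sheet feeder capacity mentioned in the comments; the function name `scan_estimate` is made up for illustration, and the figure ignores cutting and reloading time.

```python
import math

SHEETS_PER_MINUTE = 20  # approximate duplex speed quoted in the article
FEEDER_CAPACITY = 50    # sheets per feeder load, per the comments (assumption)


def scan_estimate(pages):
    """Return (sheets, feeder_loads, minutes) for a book with `pages` printed pages."""
    sheets = math.ceil(pages / 2)                # each sheet carries two pages
    loads = math.ceil(sheets / FEEDER_CAPACITY)  # how many times the feeder is refilled
    minutes = sheets / SHEETS_PER_MINUTE         # pure scanning time only
    return sheets, loads, minutes


sheets, loads, minutes = scan_estimate(200)
print(f"{sheets} sheets, {loads} feeder loads, about {minutes:.0f} minutes")
# prints: 100 sheets, 2 feeder loads, about 5 minutes
```

For a 200-page book this gives 100 sheets in two feeder loads and about 5 minutes of scanning, which matches the figure above.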

The problem is turning a bound book into sheets. I've been using a utility knife to cut the pages, but I'm hoping to find something quicker - and smaller than a bandsaw.

But the knife only takes a few minutes. In less than 10 minutes I can reduce a bulky 2-3 pound book to a weightless file with all the typography, graphics and even the paper's color preserved in a PDF.

A target-rich environment

I still have books from college and grad school, such as Mendenhall's Understanding Statistics, the most practical stat book I've found. Having it on my computer rather than upstairs on a shelf makes it much more usable.

Many books get occasional reference use, and they are all top candidates for the cut/scan/read treatment. Hardcover books also can be cut up, scanned and reglued, or so it is claimed.

There are non-destructive scanners but my point is to get rid of the book after it is scanned. For college students the point is to avoid paying high textbook prices.

I'm not saying they should have scanned those books, but I understand.

The Storage Bits take

Buckminster Fuller used to talk about the ephemeralization of the physical, the logical outcome of "doing more with less." We are still caught up in the physicality of media, from the Ten Commandments' stone tablets to the soothing solidity of a book-lined wall.

But the downloading iTunes generation has moved past that. You want a greener society? Stop cutting down trees and lugging books around.

Me? I just want to save my back the next time I move.

Comments welcome, of course.

  • i dunno man.

    there's something about destroying books that makes me queasy. i know you're not destroying the information... but it's a visceral response, i guess.
    • On the plus side I'm learning about book binding.

      Me too. But now that I've done it a few times I'm starting to enjoy it.


      Robin Harris
      • Books about Bookbinding

        If you want to learn about bookbinding, here is a great collection of books from the Cary Library located at Rochester Institute of Technology. This site allows you to order high-quality reprints of these great books. There are 656 books, all dealing with the art of bookbinding.

    • Oh my God... killing books...

      Yeah, I had that weird feeling while reading this article. Information was preserved, but the book... there is something about the paper I would not dare to destroy. (Of course there are books I would not worry about.)



  • My biggest issue

    Right now when I fly, or when the power goes out, I have a book in my lap, reading comfortably by natural light. Sometimes I'm even crazy enough to bring a book with me into the bathroom and read while I soak in the tub. PDF versions wouldn't quite cut it in those environments.
    • Why not? eInk is easy to read by natural light

      Any eReader that has an eInk screen uses no power to display the screen (except when you turn the page) and can easily be read in natural light. Now all you need is a waterproof reader - grin.
  • RE: Cut. Scan. Read.

    I am one of those college textbook authors. I spent years acquiring the knowledge and skills to write the text. Then nearly 200 hours on the first draft, then I spent money for someone to review the text, and then the hours dealing with the publisher. For that trouble I get a whopping 6% of the wholesale price. You take my work after buying it and use it for personal reference. Fine. Destroy the book and keep it for your personal reference. Fine. But give it away to others in electronic format? You are stealing. Plain and simple.

    Gee, thanks. I think I will write another. Not!
    • I don't condone it . . .

      But with a senior in college I understand it. Much of the issue is
      because publishers built their business model around putting ink on
      paper and shipping boxes. We don't need that as much now - so there
      goes most of their value add.

      Ultimately, the value chain between the producer - you - and the
      consumer - students - needs to get much shorter. Prices could drop
      80% and you'd still do as well as you are now - maybe better.

      But don't expect publishers to see it that way.

      Robin Harris
    • We are sorry for your loss

      But the fact that you have bought into a failing business model doesn't stop the march of progress. The fact that you helped pay for the shiny new corporate headquarters of somebody like Houghton Mifflin, McGraw-Hill, or HB&J while you are left with pennies should hopefully encourage you to look at alternative publishing methods.

      The ideas of public libraries, book rentals, and used bookstores have all been attacked by authors and publishers over the years, who viewed the ideas as threats to their profits. Electronic storage and distribution doesn't change anything other than the associated costs.
      terry flores
    • There are a lot of problems with the current model

      that are not easily fixable. You spend all that time writing, reviewing and publishing and you only get six percent of the wholesale price. After you do all of that there is no guarantee that you can even sell the textbook to the schools. Also, I don't know but I would bet that everybody in the chain makes more money from your book than you do and you know that there's no way that they're going to give up that money.

      I don't know the difference between the wholesale price and the retail price but I'll bet it's a lot more than six percent. I'd go farther and bet that you don't get a penny of the used book sales at the college bookstores.
      Beat a Dead Horse
  • I can't do it

    There's just something about destroying books I can't do. I think it's all those hours in libraries as a child.
  • Missed pages?

    How do you know two pages didn't stick together and there will be information missing when you go to look at the file after destroying the book?
    • pages usually have numbers at the bottom. ;) n/m

      • Better question

        How do you avoid the issue of screwing up and *not feeding a page* into the scanner? Then forgetting (or not noticing) the page is gone before you destroy the original.
        • Easy. Don't screw up.

          Also, stack the cut pages in order. The ScanSnap holds 50 sheets and a
          mis-feed is pretty obvious.

          Then I scan through the PDF to make sure all the pages are correctly
          oriented and in order. There's a bit of a learning curve, but not much.

          Robin Harris
  • RE: Cut. Scan. Read.

    I don't feel bad at all for the publishers. One of the biggest scams in the world is college textbooks...
  • Horrible

    And one day the disk crashes, and another day another disk crashes, and OCR does not work perfectly, so "books" can come out as "8ook5" in many places, just for example.

    It is simply the production of digital garbage.
    • OCR is in addition to the scan

      The scan gives you the original text as it was printed - not as it was
      OCR'd. In my stat book you've got all the graphs and figures as printed.

      After OCR'ing you can then select, copy and paste the text - and THAT is
      when you'll see any OCR errors. But with printed materials - not
      photocopies - the OCR is close to 100%.

      Net/net: it works quite well. And I maintain 3 backups of the material so
      I can survive even a house fire.

      Robin Harris
      • Re: Searching the PDFs?

        Intriguing. What OCR software do you use? And, can you search within the resulting pdf, or do you have to search a separate file?
  • Um ... do you really have that much time on your hands?