Today's data may be tomorrow's gibberish, Internet pioneer warns

Vint Cerf says fast-changing technology may render many data formats unreadable -- as it has already done in the recent past.

Anyone who built a massive cassette music collection in the 1970s or a massive VHS video collection in the 1980s understands the problem all too well. Technology shifts can render content inaccessible and the media useless. Today's data may be at risk of becoming unreadable someday as well.

That's the warning sounded by Vinton Cerf, a co-creator of the Internet in 1983 and currently Google's chief Internet evangelist. As reported by Computerworld's Patrick Thibodeau, Cerf cautions that "digital things created today -- spreadsheets, documents, presentations as well as mountains of scientific data -- won't be readable in the years and centuries ahead."

The implications are huge: many organizations now have more than a petabyte's worth of data on their premises, containing everything from customer records to knowledge bases on products and services. This isn't just data being shunted out for archiving somewhere; it increasingly serves as a source of actionable insights for guiding decision-makers. Some organizations need to retrieve stored data files from many decades back, such as local governments accessing property data or scientific firms revisiting research information. When all this was on paper, it could potentially remain readable for centuries.

Now, records may not be accessible for much more than a decade. For example, data from 1997 PowerPoint files is unreadable in current Microsoft Office software, Cerf notes. And what happens if a vendor supporting a particular digital format goes out of business?

For his part, Cerf is advocating the creation of "digital vellum," a reference to the mammal-skin parchment known to last for centuries. The cloud may play a role in this effort, since it separates content from the underlying -- and changeable -- hardware and software. Beyond that, the closest thing we have is data maintained in non-binary, human-readable text form. The challenge is storing it on electronic media that will remain compatible with whatever technology comes along over the coming decades.

This post was originally published on

