"It...is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory."
Thus did Bush describe what he called a "memex," which he envisioned as a desk-size appliance festooned with "slanting translucent screens," buttons and levers, and loaded with microfilm. Data entry would be accomplished by means of "dry photography" on a transparent platen--a midcentury vision of the scanner.
Fifty-seven years later, Microsoft researcher Gordon Bell has realized Bush's memex, having entered "nearly everything possible from his entire life" into his computer as part of a Microsoft project at its Bay Area Research Center (BARC) in San Francisco.
Bell, who graduated from MIT with both a bachelor's and a master's degree, was from 1960 to 1983 vice president of research and development at Digital Equipment, where he was responsible for the VAX Computing Environment, among other products. From 1966 to 1972 he was professor of computer science and electrical engineering at Carnegie Mellon University, and in 1995 began his tenure at Microsoft's newly established BARC.
The 68-year-old is the author of, among other books, "High Tech Ventures: The Guide to Entrepreneurial Success," in which he posits the Bell-Mason Diagnostic for analyzing new businesses. In 1999, he helped found the Computer History Museum at Moffett Field, in Mountain View, Calif., and he serves on the boards and technical advisory boards of Cradle Technology, Diamond Exchange, The Vanguard Group and the Bell-Mason Group. Bell's awards include the IEEE Von Neumann Medal, the AEA Inventor Award, the 1991 National Medal of Technology and the 1995 MCI Communications Information Technology Leadership Award for Innovation.
Bell spoke to News.com about his work on the Memex idea, which is variously called MyLifeBits and MyMainBrain.
A: It really evolved over time. I'd read the Bill Gates books, and Nathan (Myhrvold) had been a part of that, too, in terms of saying (regarding storage capacity), "at some point we'll be able to capture everything we have." Jim Gray, who heads the lab at BARC, who's more of a server guy, he and I started looking at how big disks were going to affect what's going to happen to the capacity online. When you do this, then you come up with the question, what are we going to do with all the capacity?
Jim and I wrote an article on the 50-year outlook for computing, and that's when we realized that the amount of storage was so vast that we in principle could capture everything--everything you read, every picture you've ever taken, everything you've said. We're past the point where your laptop can capture everything somebody's read. With digital photography coming in, it allows you to easily store all that stuff, and everything you've heard in terms of music.
When did you actually start capturing your life in this way?
I started this in 1998. I had all this material, and was even moving around with boxes of stuff, some of it going back to Digital Equipment days, and I thought, I'm going to start capturing it. Jim humored me, and we ended up with all this material, and we thought, this is starting to get interesting, and this is something that people should naturally do in their life.
I think it's more natural now because so much of the material originates with a file or it comes to you electronically. I think people are naturally sort of squirrels--whether that's information or thing squirrels. At one point I thought, I want to get rid of everything. I want to melt down the gold medals and that sort of thing. I want my life in a bunch of bits.
Do you really see a market for this kind of application?
The question is, will people do it? I think they will naturally do it, and for certain people, for kids, it may be natural to want to do this. If the tablet (PC) becomes hugely successful, then it's going to all be there. I have no doubt that within 10 or 20 years that will be successful. It squeezes out the paper as the input media.
What exactly do you mean when you say it's all going to be there?
You take all the various information media that you have--for example, your correspondence, all the e-mail or letters that you write and print. All the papers that you wrote or read. One of the capabilities we have in MyLifeBits now is that you turn on the browser or the explorer and capture every page you looked at. The system now captures everything that you see that comes to you electronically.
You've got all that. I'm capturing phone conversations, so those are available. I've got a Sony voice recorder, and that will be another capture device, so in principle every conversation you have could be captured there. The TiVo capability could get you all the TV you'd ever watched. Today we can't do that because it's a gig(abyte) an hour. That's kind of practical, but whether or not you want to save all that junk is another question.
We've sort of gone off on this direction of looking at massive capture, looking at anything that can be--I like the word "cyberized," but a lot of people hate that. Digitized, encoded--we can encode all this material at this point in time, and our research is looking at how you would do it and how you would rationalize why you would do it.
Why would you? What's the point of saving all this information?
I break this problem in a two-by-two quadrant. There's personal information, then professional information, and there's today. Anything yesterday is almost an archival problem, and anything I'm using today and recently is working. In each of those quadrants, there's content you would do for different reasons. In my professional life, there might appear a book, my paper or some other paper that's of value to me going forward. The working part is really dealing with what I'm doing today.
It's, in a funny way, no different than walking into a knowledge worker's office of 10, 20, 50 years ago. You walk in and look at how old people are by going into a faculty member's office or a researcher's office and they have a load of books and file cabinets and paper stacked everywhere. Maybe they can find something, but I tend to think of the world as really fairly clean. I want to scarf up that material. There's really so much material, but the only thing I really trust is my computer. We really need machines to help us search, organize and hold and to be able to retrieve the vast amount of information that's come into our visual cortex.
You work for Microsoft, a company whose CEO has had quite a lot of his e-mail read back to him in court. If you start bringing in everything you've ever said or seen in addition to written, aren't you setting yourself up for a really nasty subpoena?
If I have any scenario that says you wouldn't ever want to do this, or you couldn't afford to do this, it's that one. You've hit on the one point that's like that.
I guess the counterargument is that if that material ever existed in one place, the likelihood that that track would be there anyway is very high. All I'm doing is making it damn efficient for someone to go in. Right now we have deniability. Deniability is how you treat that now. The CEO who doesn't know that what he's doing isn't right over the edge of legality. That's one of the things with phone conversations. We'll have this in e-mail but not in phone conversations. In phone conversations, it first starts out saying, I'm going to record this.
I'm still not sure I understand the value of recording all this information, or at least what value would compensate for the substantial privacy risk involved.
I have value every day in terms of being able to access something that I need to recall. I do everything electronically. At this point--this year is the first I've done it--I go out of my way to make sure that there's no paper. Our scan pile is roughly 12 inches high--that's the amount of material that's been scanned for the year, which includes canceled checks. Even though I can get those from the bank, I wanted more ease of access to them. The material I throw out is roughly 10 times that amount. I don't think that I need it. I throw out a lot of material--like I may invest in a company. Those are e-documents, and in general I don't even keep those myself. I depend on the company to keep those things. Once a day, more than once a day, there's something I'm referring to. An air ticket--I make sure that all the air tickets for the family are on the machine.
I treat paper as a screen dump. That's all it is. It has no significance by itself. The only paper that has any value now are money and stock certificates. I have to keep those as originals. Anything else is in the machine, and on paper only as a screen dump.
The fact that it's there, if there's a fact that I need, I have the computer to go find it for me. How it's valuable is, all the times I would go to a filing cabinet to retrieve a piece of information, I now go to the computer instead. That gets into the storage and retrieval process. I've got the stuff in there, now how do I go find it? We haven't solved that problem to the extent that I'd like to. The system that we're building is that we take all those files, and those are all chunked in a single database, every Outlook message and contact, every document, every .jpg or .wma file, every compact disc, all in a single store so I can find it by uniform search. Should we put all the money transactions in there too? All the contacts, payees?
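The single-store idea Bell describes can be illustrated with a short sketch. This is not the MyLifeBits implementation, whose schema the interview does not describe; it is a minimal example assuming a SQLite database, and the table layout and the function names `open_store`, `add_item` and `search` are hypothetical, chosen only for illustration.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) one database that holds every kind of item."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        " id INTEGER PRIMARY KEY,"
        " kind TEXT NOT NULL,"      # e.g. 'email', 'document', 'photo' (assumed labels)
        " title TEXT NOT NULL,"
        " body TEXT NOT NULL DEFAULT '')"
    )
    return conn


def add_item(conn, kind, title, body=""):
    """Store one item, whatever its type, in the same table."""
    cur = conn.execute(
        "INSERT INTO items (kind, title, body) VALUES (?, ?, ?)",
        (kind, title, body),
    )
    conn.commit()
    return cur.lastrowid


def search(conn, term):
    """Uniform search: one query covers every kind of item.

    LIKE is a simple substring match (case-insensitive for ASCII in
    SQLite); '%' and '_' inside `term` act as wildcards.
    """
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT kind, title FROM items"
        " WHERE title LIKE ? OR body LIKE ?"
        " ORDER BY id",
        (pattern, pattern),
    )
    return rows.fetchall()
```

With made-up sample data, an e-mail and a scanned receipt that both mention an air ticket come back from the same `search(store, "air ticket")` call, which is the point of keeping everything in one store rather than in per-application files.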
That brings me to my next question, which is, when do you turn this off?
When do you turn off the browser and say, gee, I've been visiting a lot of porn sites and I probably don't want that in there? (Laughs.) It's true, a lot of our guys are a little bit queasy now, thinking every time we open a file, that's going to be in there, too.
On the BARC Web site you write, "The technical challenge is ensuring that this information will be readable by future devices." How are you going about this?
We're not really addressing it seriously enough at this time. For now what I'm doing is putting a small number of what I hope are going to be golden standards. Right now I'm depending on HTML (Hypertext Markup Language) as being there. Also on .doc being there--(Microsoft) Word. But I feel that HTML is better because there's so much Web page material out there. TIFF is a nice format because it's such a low-level format, and one can imagine that it will always be readable. So a small number of formats. What I don't depend on is Money 2003, Intuit 2003--anything that has its weird own database that I don't think will be readable over time. So I try to reduce that down to something I expect will be readable.
Bush envisioned a contraption that resembled a desk sprouting screens. What does Microsoft's prototype look like?
Our prototype looks like a mundane old PC, but Vannevar Bush also had--what you sometimes see as a camera mounted on his head. There's one version with a camera strapped to his head that would capture what he'd seen. Our version would be a PC with links to television, telephone, the Web, a way of capturing the cameras. It's a sensing device for all the electronic media.
The Greeks had an idea of Lethe being this very desirable river because drinking from it would allow you to forget. Aren't there some virtues in forgetting some things, of letting our limited memories act as filters for some information?
I feel the opposite, which is that there are times when I think I did something or invented something and I can go back and look at it. It makes me very humble because I can go back and say there was nothing that great about it, I didn't invent that! I am finding it very humbling. This helps you deal with that reality. And I can imagine in another 10 or 15 years, I will get joy out of it. I've observed that people will sit for an hour or so mesmerized by these 8,000 photos that come up. I want to get a screensaver that deals with PowerPoint, anything with images, things that are visually pleasing. Maybe I want to remember it. I don't know--I'm thinking I'll want to remember it, not forget it.