IBM's Sutor asks how you share documents. Wrong question, right time?

TOPICS: Software

In his blog, IBM's open source/open standards veep Bob Sutor asks "How do you share documents?" The post briefly delves into the history of how paper and eventually digital documents are created and passed around, recommends using PDF if the document requires no further editing by recipients, and ends with:

A word processing document in some sense represents the raw material of a document on its way to final production. Once it gets there or if you just want me to read the content, send me something that I can read on multiple platforms with no trouble. Of course, in my particular case, if it must be a word processing document, make it ODF.

Sutor has been IBM's chief OpenDocument Format (ODF) evangelist, especially since it entered the public spotlight last year. Setting aside the format debate for a second, I couldn't help but wonder, given Sutor's post, if "documents" by itself is the wrong word in the question "How do you share documents?" Sutor accurately zeroes in on one of the problems with document sharing while hinting at another. Do you really need the heft of a word processing document format to share the document, or can you go with something lighter weight? When you do share a document in a heavy word processing format, are you better off when such sharing can happen regardless of the application software in use by the recipient? (The insinuation is that you're likely to find broader support for ODF than you are for alternatives.)

Both are fair questions. But the word "document" stacks the deck in favor of what some might consider to be Draconian approaches -- documents, email, and emailing documents -- to sharing and collaboration in the first place. In fairness to Sutor, there are plenty of use cases that require those approaches. But even when considering those, perhaps we're better off thinking in terms of sharing knowledge and information before thinking of documents. By freeing ourselves from document-oriented thinking, we clear the way towards developing more efficient collaborative infrastructures (as opposed to tying ourselves to the older, less efficient ones). So, before we encode that knowledge into some sort of document and debate what format it should be stored in before we mail it around, why don't we think about the most efficient way to share that knowledge and work from there?

Between blogs, wikis, static Web pages, and RSS, now is the time to let the Web force you into rethinking how you share knowledge. When he was president at Userland (he's at Apptran now), Scott Young told me:

When we go in and we look at these situations and what we see is pretty much complete reliance on e-mail as the default publishing mechanism and the default knowledge repository within an organization, and that's a little scary because e-mail is not searchable, it's not accessible to anyone, it's completely distributed and most of the time if somebody leaves the company, they just simply just wipe it all out, even though it's everything that person ever did and everyone that they were in contact with and all that information is simply lost....E-mail is going to be appropriate for certain kinds of conversations and it's going to be inappropriate for others and one of those is the publishing kind of capability. So if you're writing your department report, post it to your department Web log and make sure people can see it by subscribing to it. And then it takes the burden off of you to have to ensure that everybody in the company sees it.

Blogs make for an excellent knowledge sharing medium when you're in Bob's PDF mode: the mode where the knowledge needs to be published but isn't editable by the group. That knowledge is Web-based (and who doesn't have Web access?), which makes it infinitely more searchable by the organization (and easier to integrate into other knowledge by way of hyperlinking). Through RSS, recipients can be notified of its availability as well as any changes. Comments (a form of collaboration) can easily be filed by "recipients." And if something that's inherently more distributable (write once, circulate to many) like PDF is required, RSS supports the notion of enclosures which, to Sutor's bandwidth consumption point, are far more bandwidth efficient than email attachments since distribution across a network isn't a foregone conclusion (with blogs, the "recipient" has to want the file before it crosses the wire).
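To make the enclosure idea concrete, here is a minimal sketch of a department feed whose one item points at a PDF. It uses only Python's standard library. The function name build_feed, the example.com URLs, the report title, and the file size are all made up for illustration; the one thing taken from RSS 2.0 itself is that an enclosure carries url, length (in bytes), and type attributes. The feed holds only a pointer to the file, so the PDF crosses the wire only when a subscriber's reader actually fetches it.

```python
import xml.etree.ElementTree as ET


def build_feed(title, link, description, items):
    """Return an RSS 2.0 feed as a string. All names here are illustrative."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description

    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
        enclosure = item.get("enclosure")
        if enclosure is not None:
            # The feed carries a reference to the file, not the file itself.
            ET.SubElement(node, "enclosure", {
                "url": enclosure["url"],
                "length": str(enclosure["length"]),  # size in bytes
                "type": enclosure["type"],           # MIME type
            })
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical department blog with one report attached as a PDF.
    print(build_feed(
        title="Department Web log",
        link="http://example.com/dept/",
        description="Reports from the department",
        items=[{
            "title": "Quarterly department report",
            "link": "http://example.com/dept/q3-report.html",
            "enclosure": {
                "url": "http://example.com/dept/q3-report.pdf",
                "length": 482133,
                "type": "application/pdf",
            },
        }],
    ))
```

Compare that with emailing the same PDF to fifty people: fifty copies travel whether or not anyone opens them.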

For all intents and purposes, wikis are blogs that have exchanged the diary-like posting format for the ability to let multiple users edit the same piece of content (aka: collaborating on knowledge). In other words, instead of sending an editable document around, host it as a Web page that anybody with access to the wiki can edit. Wikis also support RSS (notify me when this wiki page changes). Revisions can be tracked and restored. Content can be edited with user-friendly WYSIWYG tools. Traditional content management systems, look out!

Does this work for knowledge of all types the way a word processing document might? Maybe not. But things are improving. For example, there are certain types of knowledge that only a spreadsheet can capture. Last week, we had to pass spreadsheets around to share them. Reconciling changes by Mary, John and Sue was a bitch. A few days ago, with ODBC strapped to Excel spreadsheets (IT department required), we were able to publish them as uneditable HTML tables (even SQL queries worked!). Today we have Dan Bricklin's WikiCalc: Web-based spreadsheets that, in true wiki fashion, are editable by multiple users. Right there on the Web page! Relatively speaking, it's a new technology. WikiCalc is in alpha (beta is coming soon, according to Bricklin). But its newness shouldn't be confused with the benefits. Are wikis getting better at accommodating richer content? As evidenced by solution providers like Jotspot and, more recently, WetPaint, yes. Can wikis and the Web stretch their tentacles over the air into mobile devices like BlackBerries the way email does? As evidenced by SocialText's Miki, yes. In fact, Web access on such mobile devices is far more pervasive than the ability to open specifically formatted documents that arrive as email attachments (let alone edit them).

OK. So, at the end of the day, whether it's a blog or a wiki, the knowledge is still technically encoded into a file, which most people equate with a digital document. Actually, in many cases -- for example WordPress and MediaWiki -- "collections" of knowledge (blog entries, wiki pages) are stored in PHP-retrievable records in a MySQL database. But we see those collections on Web pages that, for some, conjure up the notion of documents. But to most, document means an Office document (Word, Excel, etc.) or a PDF document, and starting with the sort of thinking that you have one of those documents to share is the wrong way to go about it. Think about freeing your knowledge. Then worry about the format after your thinking leads you to regular document land (which continues to be appropriate in very many use cases). Not only that, as I've written before, if knowledge encoded in wikis and blogs absolutely has to be forcibly removed from its resting place, there's no reason a file format like ODF can't be put to use then.

  • The biggest thing, we need to forget about formatting for 8.5x11

    And, since word processors are 90% about formatting for printing on 8.5x11 paper, they are a very, very poor way to share, collaborate, exchange ideas, etc. You spend all your time making sure it prints ok, never mind that you don't need to print it. I still laugh at people that think you have to have MS Word with Sharepoint if you want to collaborate. As if you can only collaborate with printed documents.

    Another thing, we need to get over requiring that the final output of some work such as projects, grant applications, analysis, studies, etc, should be printable output. We need some standard way of putting everything together in a single "document" that can be archived, and we can point to as the final output of some work, without requiring it to be printable.

    Paper is so yesterday (well it should be).

    Well, great blog!
    • Oh my, I think I've just bust a rib on this one..

      Anyone remember the tale of the "paperless office", circa 1970(-ish)? Thirty years on and it still hasn't happened. Simply put, there are always going to be people / situations when only a hard copy will cut the mustard. This means that ALL documents still need to be formatted for paper, because you never know when someone will want a hard copy. If you risk not formatting your documents for paper, how do you know that it will be laid out how you intended when the user formats it to print it?

      A paper-reduced office? I'll go with that, but a paperLESS office? Not for a LONG time yet. In 60+ years of computing history there still hasn't been a major revolt away from the trusty pen and paper, and IMHO nor will there be in the next 60 years. Show me a computer media that can store documents for 1000+ years...
      • Right...but not quite

        You are right: you will always have situations where nothing but paper will do. But, you don't need to author as if you will be on paper! I'm telling you, we have a team of tech writers that are documenting our company's product (which will go to print!!), but are doing it using an XHTML authoring scheme whereby the XHTML is transformed into PDF for print.

        And it works great!!
      • But the format for print...

        The format for printing should be a delayed event, only
        necessary at the point where the information is actually going to
        be printed.

        Too often we're faced with form over substance. If I'm not going
        to print the document, then formatting for printing is a waste of
        time. If you need to produce a hard copy of the document, then
        at that point you should make the decision to format for

        You make it sound as if the format is more important than the
    • AND forget 8.5x11 format w/online docs!

      How ridiculous is it to receive a manual on CD along with a new product (note: no print version, just CD) and the PDF is formatted for print. I mean the CMYK registration marks, offset info, all of it. And you're stuck navigating around a messy document dealing with page offset dopiness, i.e. TOC lists print doc page 23, but the PDF page number is 29, etc.
      This makes one assume that the company that sent it has little collective common sense, and that their products may be of the same caliber and best be avoided.
    • Forget about paper !

      Why on earth do we send documents to be used on a display in US letter or, as is more common on my side of the Atlantic, A4? In the '80s Apple produced a 'document format' display, i.e. 'portrait' format, but since then displays have been mostly rectangular, with a tendency to TV letterbox format on laptops. That means either scrolling, or leaving two wide unused parts of the display on both sides. Areas which could be filled with information and put to use.

      Why on earth do "we", with the exception of very few, send finished documents in an editable format? Sending business correspondence which can be modified by the receiver?? And if it is a document that IS to be modified, we send it in a format needing SW from a specific supplier only running on a specific platform. And even adding that the format is closed, so you have to hack your way into interoperability with anything else.
      How do we even ensure that the information is readable in 100 years? Even now older documents give big problems when opened in newer versions of the SW that created them. One solution to this is of course that formats are open, well defined, well documented, and certified by a standards organisation like ISO or ANSI.

      The situation now will not give us any credit from the generations following us.
  • And who doesn't have web access?

    The question was asked in jest in this story; however, for those who do not live on the coasts of the United States it is a very real and excluding concern. Only the coasts and large cities have prevailing, high speed, affordable and reliable web access. The model presented for web based information interchange only works for the urban and/or affluent. Not for the users with dial-up available only and without the dollars for very expensive cellular/satellite connections. When the web becomes the mechanism of commerce, the third world and the poor will be the poorer for it.
  • You seek "single source" documents

    Before I start: I am currently working as Content Management Technical Architect with one of the top 10 ASP's as published by a ZDNet article not too long ago (I think salesforce topped the list???). Anyway, we are working on what is the solution to this problem for internal use only.

    The solution: a single source of content editing. There was plenty of early support for DITA (another IBM proposition that can also be found at OASIS' web site) as our choice of standard, but it requires the user to be hands on with the XML, or to pay for an XML editor that is overpriced for what it does.

    What is single source? It's the answer to your question. What the web needs now, is a simple XML formatting standard that is easy to follow and easy to reference. It also needs to be easy to publish. Content management vendors missed the boat on this one because they try to keep everything proprietary-- keeping their system in the loop. I say to hell with them.

    My solution is simple... an XML document with two sections: metadata and content. Publish the document and syndicate its sections. Subscribers can select which XSLT to apply and they can get the document delivered as MS Word, Adobe, Compiled Help, XML, HTML, or any other markup for which an XSLT can be made. The metadata would contain relevant info (tagging), the contents could be stored as XHTML. This allows every web based WYSIWYG editor to create this format, and everybody else to reference it. It also allows for easy transport over web services (portability from one service to another!!)

    Unfortunately, single source is already an industry movement and will be killed by corporate posturing and the #1 marketer of their own format.

    The key to single source is the separation of the formatting from the content. The beauty of using XHTML is that XSLT can be used to strip away unnecessary formatting easily for the subscriber.
  • Question should be How SHOULD we share documents

    Emailing documents and pertinent messages will never go away, especially in regard to business to business information. But this type of "sharing" is usually meant in a limited fashion, as in, "I'm sharing this info with you because it's imperative to us doing business, but please don't share it with anyone else (competitors, journalists, etc.)." Implying this is proven to not be enough. It is common practice to forward communications regardless of whether the person who emailed it to you is okay with that or not.

    How SHOULD we share documents? When you want to maintain control over who can and cannot access emailed data, rights management and encryption should be applied. There are now easy to use desktop solutions like Taceo that anybody can use:
  • "knowledge and information"

    I agree that asking about how we all share our documents is the wrong question, but I am not sure that "we're better off thinking in terms of sharing knowledge and information before thinking of documents." For one thing I do not think we have recovered from the knowledge evangelists making a complete hash of the concept of "knowledge" (into which bits and pieces of the concept of "information" were also ground); and I do not think any of us would benefit from yet another stab at defining knowledge. (Personally, I think that Plato's "Theaetetus" is still our best source. Socrates sets the young Theaetetus the problem of defining knowledge. The young man makes four admirable attempts, and Socrates shoots down each of them. So the dialogue ends without any conclusive definition.)

    Nevertheless, as one who likes dialectical reasoning, I think it is desirable to seek out SOME concept in opposition to "document." However, recently I have been trying to train myself to use the phrase "communication artifact" instead of "document," because I think the important attribute of the document is its artifactual nature. Indeed, it is BECAUSE it is an artifact that we can have debates about format, since those are debates about whether or not there should be rules that govern what constitutes a well-formed artifact. However, following John Seely Brown and Paul Duguid, I suspect that the appropriate opposition to "communication artifact" is "work practice" (which, in a way, amounts to recognizing the distinction between what we say and what we do). Ultimately, our behavior at work (or at recreation) is effective on the basis of what we DO. What we do may be informed by documents we read (or write) but there is no end of evidence of the limitations on what and how documents can inform us. Thus, we should not be asking how documents are shared in a work setting; instead we should be asking how effective work practices get effectively REPRODUCED (to use a favorite term of Anthony Giddens) across the spatial extent of a large enterprise (or, for that matter, a small one) and across the temporal extent through which the past informs the present (the Neustadt-May concept of "thinking in time").

    For some time I tried to develop this dialectical opposition in terms of the opposition of NOUNS (as in artifacts) and VERBS (as in practices). However, I have recently become very absorbed in Kenneth Burke's GRAMMAR OF MOTIVES; and I am beginning to have doubts about using this as my foundation. There are a variety of other candidates that Burke explores, particularly in his analysis of Spinoza. One is the opposition of the passive (artifact) and the active (practice) or, from a slightly different point of view, the product and the producer. There is also the opposition of the object and the subject, which I think is particularly important in any organizational setting, since the PEOPLE are always at the heart of any organization.

    I am also particularly fond of an aphorism coined by my colleague Noam Cook: "Knowledge cannot be shared, but it can be made sharable." For me this is a good crack at finding a synthesis across the opposition of artifacts and practices; and I hope it will diminish some of the passion currently applied to the more secondary question of how we format our artifacts.