Tax on What? Taxonomy on the Intranet

Written by Oliver Marks, Contributor

This is a guest post from one of the world's leading taxonomy and folksonomy experts Thomas Vander Wal, who can be found at Infocloud Solutions.

Thomas has very kindly taken some time to share some fundamentals of organizing information: the 'walk before you run' groundwork for making stuff findable and contextually useful. Many people hope the installation of shiny new technology will somehow solve these problems; the reality is that it takes the type of planning Thomas describes below...

One of the most common complaints employees make to those who manage and maintain the organization’s intranet is, “I can’t find what I am looking for”. Not only that, but employees also have trouble refinding things they have found before. If intranets had a theme song, it would be U2’s “I Still Haven’t Found What I’m Looking For”.

When I come back into the United States from work abroad, the TSA/Customs agents always ask, “What do you do?” The easiest answer I have found is, “I am a designer.” It is not perfectly correct, and most designer types would rightly call B.S., but it is close enough. Occasionally the agent will ask, “What do you design?” I state, “I design intranets so that it is easier for employees to find things.” This always gets the response, “We need your help, because I can’t find anything I need, or am required to find, on our intranet.”

That little verbal dance is quite sad, given their job and responsibility. But this problem is not unique to TSA/US Customs; it could be, and is, nearly every organization. There are many contributors to this problem: content doesn’t exist (content management tools are impossible to use simply and easily); there is too much information and out-of-date content (search issues, and archiving policies unused or nonexistent); things can’t be refound (no static URLs and no easy way to bookmark or store links to content); and vocabulary problems (no taxonomy, or one that is not maintained or not exhaustive enough).

This last common issue, vocabulary problems, is the key focus of this piece. Vocabulary use is central to finding and refinding information. It is central to search. It is central to clustering and aggregating similar and related information.

Taxonomy

One of the core solutions involves building and maintaining a taxonomy. A taxonomy is simply a set of words or terms that are commonly used and understood by a population of people, and that help them understand what information and objects will be found connected to each term. One can think of Yahoo!’s original purpose as a link directory as built on a taxonomy, as well as the Yellow Pages. One of the most common uses inside businesses is the navigation structure on the intranet’s pages. Taxonomies are used not only for labeling things to be clicked on, but also to improve search by clustering information and objects.

Taxonomies can take many shapes. Often they are thought of as hierarchies, but they can be flat lists, tree & leaf, matrix, facets, and more. Taxonomy shapes are often determined by the tools that require them: CMS, portals, document management (essentially any ECM), CRM, supply chain tools, enterprise search, etc. The value derived from many of these tools often depends on the quality of the taxonomy (it also depends on ease of use, which few of these tools have and which is a completely different problem).

The key to taxonomies is less what shape they take than how broadly the terms are understood by the population using them, and how granular they are. Many times I have been asked to help improve finding and surfacing information for those wanting it, only to find 2,000 or more videos labelled only “video” and nothing else. The taxonomy needs to be broad enough and granular enough so that those labeling things can describe them precisely and those seeking things can find them.

Getting a taxonomy right, or on the path to being helpful, takes a lot of effort. It requires understanding the broad differences in terms used across an organization, as well as the nuances in how different disciplines and people of different backgrounds use the same words. Things get tricky when the taxonomy has to work not only for marketing people and engineering R&D people, but also for the 600 new people who joined the organization in the last merger. The focus of the taxonomy should be on the people using it: those applying the labels, for clarity, as well as those seeking things, so that they will find what they need.

It is quite important to get a taxonomy set up and to use it. The difficult part is that no taxonomy is ever complete, particularly in larger organizations. Even if it were complete, it would be out of date within months, if not days, due to changes in staffing, training, and interaction with customers and the outside world, whose different terms influence internal terms. Keeping the taxonomy up to date is quite an effort and is often done in cycles, every few years. These efforts are rarely inexpensive, but the payoff is large when they are done well.

Folksonomy

In contrast to the cost and effort of taxonomy, many consider folksonomy an option. What is folksonomy? Folksonomy is a term created by yours truly in 2004 to set apart the type of tagging happening in the Delicious social bookmarking tool and the Flickr photo sharing site, which allowed the people consuming the content to add their own terms to it for their own use as well as for others. Tagging in the folksonomy sense requires three metadata components: 1) the object being tagged; 2) the tag being applied to the object; and 3) the identity of the person placing the tag. The most important piece of this tagging, done by the person consuming the content, is the addition of identity.
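To make the three components concrete, here is a minimal sketch in Python. The class name, field names, people, and URLs are all hypothetical, chosen only for illustration; real tagging services store far more than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tagging:
    """One act of tagging: who applied which tag to which object."""
    obj: str      # 1) the object being tagged (here, a made-up URL)
    tag: str      # 2) the tag applied to the object
    tagger: str   # 3) the identity of the person placing the tag


# Illustrative data only.
taggings = [
    Tagging("intranet.example/travel-policy", "expenses", "alice"),
    Tagging("intranet.example/travel-policy", "travel", "bob"),
    Tagging("intranet.example/q3-forecast", "expenses", "bob"),
]


def tags_on(obj, taggings):
    """All tags applied to an object, paired with who applied each one."""
    return sorted((t.tag, t.tagger) for t in taggings if t.obj == obj)


def my_objects(tagger, tag, taggings):
    """Refinding: the objects one person filed under one of their own tags."""
    return sorted(t.obj for t in taggings if t.tagger == tagger and t.tag == tag)
```

Because identity is part of every record, `my_objects("bob", "expenses", taggings)` returns only what Bob filed under that term, while `tags_on(...)` still shows everyone's vocabulary for a single object.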

Prior to Delicious and similar services, tags were applied as if all people were identical in their understanding of the terms being used. In many cases this was very clearly a wrong assumption, nowhere more clearly than in the music tagging service Bitzi, where any music referenced on the Web and addressable by a URL could be tagged by any Bitzi user with an account. But one person’s “progressive rock” was not another person’s “progressive rock”, which quickly devolved into arguments and a lack of faith in the system. With the addition of identity, you could choose to follow the person whose use of “progressive rock” you agreed with and ignore the one you didn’t. Identity also allows for seeing the breadth of difference in terminology and vocabulary across a population around a single object.
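The “progressive rock” example can be sketched in a few lines of Python. This is only an illustration under assumed data: the song identifiers and tagger names are invented, and it does not describe how Bitzi or Delicious were actually built.

```python
# (object, tag, tagger) triples; every name here is made up for illustration.
taggings = [
    ("song-a", "progressive rock", "dana"),
    ("song-b", "progressive rock", "dana"),
    ("song-c", "progressive rock", "eli"),
    ("song-a", "art rock", "eli"),
]


def tagged_by_followed(tag, followed, taggings):
    """Objects carrying `tag`, counting only taggers the reader follows."""
    return sorted({obj for obj, t, who in taggings if t == tag and who in followed})


def vocabulary_for(obj, taggings):
    """Every term applied to one object, mapped to the people who used it."""
    terms = {}
    for o, t, who in taggings:
        if o == obj:
            terms.setdefault(t, set()).add(who)
    return terms
```

A reader who trusts Dana's sense of the genre calls `tagged_by_followed("progressive rock", {"dana"}, taggings)` and never sees Eli's picks, while `vocabulary_for("song-a", taggings)` exposes the range of terms different people apply to the same object.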

The advantage of a folksonomy is that it is cost effective, in that all of the tagging is done for free by those consuming the content. It quickly has a positive impact on search, if the tagging service is pointed to and included in search. It also easily allows people to put everything they find into their own context for their own needs (project, terms, cycles, etc.). It helps remove one of the most difficult technological pain points: refinding information a person has already found.

The downside to folksonomy, on its own, is that it takes time to develop into something broadly helpful. While it is relatively cost effective, given that contributions to the system are free, its lack of central structure can be confusing. Finding things purely through folksonomy tagging can be problematic unless one is using a really well-built tool or service.

Bringing Taxonomy and Folksonomy Together

Both taxonomy and folksonomy have very strong advantages for use inside an organization, but both have less-than-optimal sides too. What is interesting is that, when they are combined, the strengths of each account for the other’s deficiencies (as seen in the table/graphic).

The Biggest Wins

The two areas where these two different approaches benefit most from each other are the cost and effort of keeping the taxonomy up to date (emergent and validated) and the unstructured (some call it messy) nature of folksonomy.

Keeping a taxonomy current and up to date is essential for its continued success. But the effort needed to survey and research what people are calling things puts it in the category of a once-every-few-years sort of effort. One of the things the folksonomy approach does really well is validate existing terms (this object is called “X” in our taxonomy and an employee has also tagged it with “X”), but it also identifies gaps (an employee tagged something labeled “X” in our taxonomy as “Q”). Such a gap becomes a candidate for addition to the taxonomy. Until that is done, the object can still be found under “Q” through search, simply because it has been tagged with it.
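The validate-and-find-gaps step can be sketched as a simple comparison between the official term for each object and the tags employees apply to it. This is a minimal Python sketch under stated assumptions: the `taxonomy` mapping, the tag data, and the `review_tags` helper are all hypothetical, and a real taxonomy would usually assign more than one term per object.

```python
from collections import Counter

# Hypothetical taxonomy: object -> the official term assigned to it.
taxonomy = {"doc-1": "expenses", "doc-2": "onboarding"}

# Hypothetical employee tags as (object, tag, tagger) triples.
taggings = [
    ("doc-1", "expenses", "alice"),
    ("doc-1", "reimbursement", "bob"),
    ("doc-1", "reimbursement", "carol"),
    ("doc-2", "onboarding", "bob"),
]


def review_tags(taxonomy, taggings):
    """Split employee tags into validated terms and candidate gaps."""
    validated = Counter()   # official term -> times an employee tag matched it
    candidates = Counter()  # (official term, employee tag) -> times seen
    for obj, tag, _tagger in taggings:
        official = taxonomy.get(obj)
        if official is None:
            continue  # object is not in the taxonomy; out of scope for this sketch
        if tag.lower() == official.lower():
            validated[official] += 1
        else:
            candidates[(official, tag)] += 1
    return validated, candidates


validated, candidates = review_tags(taxonomy, taggings)
```

The counts in `candidates` give a taxonomist a ranked list to review; deciding whether “reimbursement” belongs in the taxonomy, and where, remains a human judgment.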

Many larger organizations have taxonomists on staff, though they often perform many other tasks and roles. These internal resources are the ones who can take the new contributions and decide if and how they fit in the taxonomy. The most resource-intensive part of the project has been done, but a human is still needed to sort out where each term fits. Smaller organizations will need to bring in a taxonomist (hopefully one familiar with folksonomies) on occasion for short stints to perform the same task. The cost shift is in the duration of the work and the size of the team needed to perform the task.

While folksonomies are a cost-effective means of keeping a taxonomy valid and up to date, they lack the efficiency and clarity that taxonomies provide. Even a well-used tagging service that generates a robust folksonomy still needs to map to the taxonomy structures that the various tools require and depend on for their success.
