Who needs metadata anyway?

Last fall, OMB and GSA put out an RFI on the question of metadata: Is search technology good enough to obsolete metadata? The results were mixed.
Written by ZDNet UK, Contributor

The government is infamous for deploying heavy-handed metadata regimes. But Google returns pretty darn good results from millions of pages that are largely metadata-free - they certainly boast no structured metadata. Perhaps, federal government managers wondered last fall, the government could save bazillions by jettisoning metadata costs entirely. OMB and GSA asked industry experts in an RFI what they thought. In basically a red state-blue state breakdown, 56% said you don't need metadata to prepare docs for search engines. On the other hand, 44% said you do, according to Government Computer News.

One search engine that really matters is Vivisimo, which has the contract to modernize the gov's FirstGov portal and search service. 


“Taxonomy building and metadata generation is expensive and laborious,” said Raul Valdes-Perez, Vivisimo chief executive officer. Valdes-Perez said he has seen many metadata projects get bogged down and never completed.


But the RFI proposed some quite complex search scenarios, and Brand Niemann, co-chair of the Federal Semantic Interoperability Working Group, doubts these could be pulled off without some form of tagging. "What they are looking for is not search but knowledge computing," Niemann said.

Obviously, the problem with Google is that it doesn't have any context. It searches through everything looking for content, but its context is completely vanilla. There's zero metadata cost, though. Dedicated database searches offer lots of context but have high costs in both search complexity and metadata prep.

Absent from the discussion, at least in the GCN article, is the potential of unstructured tagging, the so-called user-generated metadata made popular by delicious. Systems obviously can be created in which document creators and consumers are both motivated to provide ad-hoc metadata, which by definition is useful to someone - something the top-down regimes can't guarantee. The cost of informal tagging is really minimal; in fact, it's usually seen not as a cost but as a benefit by those doing the tagging.

Informal tagging is probably not ready for primetime, but it may well be the best way to exploit powerful fulltext searches while minimizing burdensome metadata procedures.
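To make the idea concrete, here is a rough sketch of how ad-hoc user tags might sit on top of plain fulltext search. Nothing here comes from the RFI, Vivisimo, or FirstGov: the `TaggedIndex` class, its method names, and the `tag_boost` weighting are all hypothetical, and the scoring is a naive term count rather than anything a real engine would use.

```python
class TaggedIndex:
    """Toy index: fulltext term counts plus a bonus for matching user tags.

    Hypothetical illustration only; not any real search engine's API.
    """

    def __init__(self):
        self.docs = {}   # doc_id -> lowercased document text
        self.tags = {}   # doc_id -> set of lowercased user-supplied tags

    def add_document(self, doc_id, text):
        # No metadata is required up front: a document is just its text.
        self.docs[doc_id] = text.lower()

    def tag(self, doc_id, tag):
        # Anyone can attach an informal tag at any time.
        self.tags.setdefault(doc_id, set()).add(tag.lower())

    def search(self, query, tag_boost=2):
        terms = query.lower().split()
        scores = {}
        for doc_id, text in self.docs.items():
            words = text.split()
            # Fulltext part: how often each query term appears in the text.
            score = sum(words.count(term) for term in terms)
            # Tag part: each query term that matches a tag adds a fixed bonus
            # (tag_boost is an assumed weight, chosen arbitrarily).
            doc_tags = self.tags.get(doc_id, set())
            score += tag_boost * sum(1 for term in terms if term in doc_tags)
            if score > 0:
                scores[doc_id] = score
        # Highest score first; ties broken by doc_id for stable output.
        return sorted(scores, key=lambda d: (-scores[d], d))


index = TaggedIndex()
index.add_document("form", "passport renewal form instructions")
index.add_document("advisory", "travel advisory for citizens abroad")
index.tag("advisory", "passport")  # a reader decides this page is passport-related
print(index.search("passport"))
```

The point of the sketch is that the untagged document is still found by its text alone, while the tag lets a reader surface a page whose text never mentions the query term.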


