The future of enterprise data in a radically open and Web-based world
In recent months, another significant front in the growing trend of open data has emerged, and with it a growing focus on what businesses can do with that most precious asset they’ve developed at enormous expense over the years: their data.
Like many aspects of applying Web 2.0 to the enterprise, the challenge is both in adapting the business and its thinking while successfully leveraging the latest delivery methods.
The advent of a new administration in the United States, which has been pushing to open U.S. government databases en masse, and a proliferation of open data initiatives in other countries -- perhaps most notably in the U.K. -- have put the often behind-the-times government world at the forefront of open data with such sites as data.gov, which the nation's CIO, Vivek Kundra, has promised will have tens of thousands of feeds this year alone.
All of this activity underscores the relatively lackluster track record of traditional businesses in understanding and managing the opportunities, risks, and rewards of open data. Despite some significant success stories, there is an apparent -- and perhaps widening -- digital divide between the classical world of business and the online world.
Even the considerable investments that most large organizations have made in IT system interoperability and integration, particularly with such popular approaches as service-oriented architecture (SOA), have produced famously disappointing results. My good friend David Linthicum, a leading SOA expert, has gone so far as to say that the lack of focus on data is a major part of the problem.
Taking a product focus instead of a project focus
For those who have embarked down the open data road to see where it leads, one thing seems to be clear: Exposing data -- whether internally within an organization, externally to partners, or even to the whole world -- is a way of thinking about the very nature of the business, more than it is about achieving a one-off end goal. This is because open data seems to create immediate, close, and powerful relationships between the publisher and the consumer of the data, and leads to a series of unexpected outcomes. These relationships can be created with extreme ease using today's methods over networks like the Web, and though they are often speculative, a good subset of them develop rapidly into important ones that can draw in new customers, identify new innovations, head off competitors, or just generate revenue. Witness Twitter and its hundreds of partners accessing the platform (and its enormous audience) through its API, or Netflix and its impressively successful prize contest, which opened up data selectively to dozens of high-value, self-selected contributors, as leading examples.
In other words, to be competitive with the next generation of businesses, most organizations are going to have to look at open data for reasons involving efficiency, competitiveness, and long-term health, particularly as open data enters their particular industry.
Enterprise open data options: Leveraging today's Web best practices
But it's still not clear to businesses what options they have or how they need to think about opening up strategic sets of data for reuse internally, with their partners, and indeed with the rest of the world. Far from being a story about IT plumbing, open data is a way of doing business, forging strong relationships over the network with other organizations, customers, and potential customers. At the same time, the success of the Web itself as a dominant global platform has made it the de facto channel for providing open data; even the internal networks of most businesses rely heavily on Web technology for their applications, intranets, and interaction with the rest of the world. This means that opening data generally means opening it up over the Internet using Web technology and approaches.
So, critically, being successful with enterprise open data requires understanding what the open data community has learned during the Web 2.0 era, which could be summarized as a long succession of hard knocks on the way to today's models for sharing data (as opposed to embedding information in Web pages, which one can almost refer to as the new silo). A glance at open government exemplar data.gov provides a quick microcosm of the problem: the site largely provides a catalog of old-school data sharing, with a focus on historical data extraction via downloaded files.
This is in sharp contrast to the online space today, where live access to the most current data is offered through feeds as well as APIs that can provide queries, reports, and, vitally, two-way communication with the data publisher. Two-way communication allows data to flow back to the publisher as needed and appropriate (which it usually is, and which is often where most of the value comes from, since consumers of open data are often the most valuable contributors as well).
In fact, the Web, through long trial and error, has generally settled on three major ways for data to be opened up to other entities on the network. These three ways are all quite suitable for enterprise use, both in internal efforts such as application development (especially enterprise mashups), SOA, and even BPM, and in the discovery of data via such functions as search, since some of the popular approaches to open data on the Web greatly facilitate the location of data. Information discovery is something that many an Enterprise 2.0 effort has found to be of very high value as participants surface long-submerged data from silos onto the flat open plains of Web-based intranets.
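To make the contrast with file downloads concrete, here is a minimal sketch of the two paths a live, two-way data service offers: a query path that always reflects current data, and a contribution path that lets consumers send data back. Everything in it is an assumption for illustration: the OpenDataStore class, its method names, and the sample records are hypothetical, and an in-memory dictionary stands in for a real service exposed over HTTP.

```python
from dataclasses import dataclass, field


@dataclass
class OpenDataStore:
    """In-memory stand-in for a publisher's live data set (hypothetical)."""

    records: dict = field(default_factory=dict)
    contributions: list = field(default_factory=list)

    def query(self, **filters):
        """Read path: return current records whose fields match every filter."""
        return [
            {"id": record_id, **self.records[record_id]}
            for record_id in sorted(self.records)
            if all(self.records[record_id].get(k) == v for k, v in filters.items())
        ]

    def contribute(self, record_id, changes, source):
        """Write path: a consumer sends an update back to the publisher."""
        if record_id not in self.records:
            raise KeyError(f"unknown record: {record_id}")
        # Keep an audit trail so the publisher can review what came from whom.
        self.contributions.append(
            {"id": record_id, "changes": dict(changes), "source": source}
        )
        self.records[record_id].update(changes)
        return {"id": record_id, **self.records[record_id]}


if __name__ == "__main__":
    store = OpenDataStore(
        records={
            "b1": {"city": "Leeds", "status": "open"},
            "b2": {"city": "York", "status": "open"},
        }
    )
    print(store.query(city="Leeds"))
    store.contribute("b1", {"status": "closed"}, source="partner-app")
    print(store.query(status="closed"))
```

A downloaded file would still show the old status after the update; a consumer calling query sees the change immediately, and the publisher keeps a record of which partner supplied it.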
These three open data publishing methods have various strengths and weaknesses but all require a strong "this is a product that businesses will build on" sensibility to succeed. These methods are as follows:
Three roads to enterprise open data
Web-Oriented Architecture. This model is an increasingly popular one that uses the basic protocols of the Web to expose data to Web-friendly sources. At its heart, WOA uses an architectural style known as REST, which can be enormously powerful in terms of creating scalable sets of two-way open data. Since most companies around the world already have an end-to-end set of tools for the technologies WOA uses, including management, monitoring, governance, and more, it aligns well with both the Internet and businesses. Because WOA encourages data to be stored in a Web-like structure online with addressable permalinks and deep inbound/outbound links, it's very search friendly and can potentially enhance enterprise data discovery significantly. A potential downside is that REST requires a good understanding of the Web to implement fully and is by itself unable to meet some enterprise requirements such as certain advanced security protocols. However, WOA is capable enough that it's a compelling alternative to traditional SOA for a large range of applications. Importantly, virtually all IT environments of any kind can readily work with open data sets using WOA as their model.
Open APIs & Lightweight SOA. An API is a well-defined interface that allows other parties to interact with your data in a highly controlled fashion. In general, an open API lets third parties integrate on-demand with a business and comes in two primary flavors: traditional and Web-oriented. Traditional APIs use older technologies such as SOAP or the WS-* protocols to provide an endpoint to the world and will be eminently familiar to most corporate developers. However, they will be less consumable by many potential online partners since many enterprise technologies are not very popular, or intrinsically effective, on the Web. There are also fewer emergent outcomes using traditional APIs versus Web approaches such as WOA or simple models such as XML over HTTP (the latter of which is the most common technical approach for open APIs today). In general, the traditional approach is favored for internal APIs and SOA or for organizations that are sure they won't need broad third-party appeal.
Semantic Web & Linked Data. Despite languishing for many years, the Semantic Web, the next generation of the Web (also created by the inventor of the Web, Tim Berners-Lee), has had a resurgence in recent years, and currently Linked Data, a relatively new movement to unsilo Semantic Web data, is gaining some favor. By far the most sophisticated and complex of the three approaches to open data presented here, it's highly suitable for certain applications that have rich data sets that need powerful means of processing and consumption. In particular, scientific, technical, medical, mapping, and certain government domains are highly suitable for this approach. It remains unclear whether Linked Data will finally trigger the boom in the Semantic Web, so use it with care. However, it deserves definite consideration, given the potential of the approach to create data sets with extraordinarily high function. Businesses already managing their data with Semantic Web technologies will be the most likely candidates for adoption. You can also view Tim Berners-Lee's recent presentation at TED on why your company should put your data on the Web.
So, like many aspects of applying Web 2.0 to the enterprise, the challenge lies both in adapting the business and its thinking, including making open data a full-blown product with marketing, technical support, and legal trappings, and, at the same time, in successfully leveraging modern online delivery approaches such as WOA, APIs, and Linked Data, and emerging new topics such as Web Squared. The boundaries of organizations today are increasingly pushed into highly federated environments that also demand much higher degrees of openness and access to knowledge. Not providing open data will increasingly mean that organizations are cutting themselves off from the avenues of highest new incremental value.
In short, enterprise data strategies that apply the latest Web 2.0 approaches to open data have the potential to reap enormous rewards, especially when competitors remain slow to adapt to the digital business environments of the 21st century.
What does the future of enterprise data mean for your organization?