As long as the reach, bandwidth, and targeting of networking technologies -- particularly the wireless kind -- continue to improve at a nearly Moore's Law-like pace, relational database management systems as we know them may eventually be a thing of the past. So said Gartner analysts Donald Feinberg and Ted Friedman at Gartner Symposium ITxpo in Orlando, FL, during a session entitled "The Death of the Database."
The premise of Gartner's argument is that as improvements in networking technologies eventually lead to real-time connectivity to any data, that data is best kept closest to its natural source rather than at the intersection of a row and column of a database that, as it turns out, is actually little more than a remote cache. An RFID-tagged can of soup was given as an example of why inventory data needn't persist in a database in order to facilitate the business processes of a grocery store.
Instead of walking the aisles, taking inventory of everything on the shelves, and then storing that inventory data in a database, the Gartner analysts said to just leave the "data" with the can of soup. Then, in the business process of restocking, the scan of all the RFID tags in the store (nightly, hourly, or however frequently it runs) bypasses the step of storing the inventory data in a database and goes directly to placing an order for more of that can of soup. Not only does the resulting business process come closer to running in real time, but a step is eliminated from the process. Said the analysts, "If I only have a millisecond need for persistence, processor and memory can handle that. The data ends up existing for less time than it takes to store the data."
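As a rough sketch of that restocking flow (not anything the analysts showed), the tally from one scan can live only in memory and feed the order directly. The function name, the per-product thresholds, and the sample reads below are all hypothetical, chosen only for illustration:

```python
from collections import Counter


def restock_orders(tag_reads, reorder_levels, target_levels):
    """Turn one pass of RFID tag reads straight into restock orders.

    tag_reads: iterable of product codes, one per tag seen on the shelves.
    reorder_levels / target_levels: hypothetical per-product thresholds.
    Nothing is written to a database; the counts exist only during the call.
    """
    on_shelf = Counter(tag_reads)  # transient, in-memory tally of the scan
    orders = {}
    for product, reorder_at in reorder_levels.items():
        count = on_shelf[product]  # Counter yields 0 for products not seen
        if count <= reorder_at:
            # Order enough to bring the shelf back up to its target level.
            orders[product] = target_levels[product] - count
    return orders


# Made-up scan: two cans of soup and five cans of beans on the shelves.
reads = ["soup", "soup", "beans", "beans", "beans", "beans", "beans"]
orders = restock_orders(
    reads,
    reorder_levels={"soup": 3, "beans": 2, "rice": 1},
    target_levels={"soup": 12, "beans": 10, "rice": 6},
)
print(orders)
```

In this sketch the only thing that outlives the scan is the order itself, which is the step the analysts argue the business process actually needs.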
At one point, Feinberg picked a more morbid example, but it really made the point of questioning how, when, and where data should persist. Feinberg rhetorically asked where his health records are better off being stored: in a database in California, on a credit card in his wallet, or on a chip that's embedded in the back of his hand. The answer, as you can imagine, was on a chip in the back of his hand. It's there that the health record of Donald Feinberg stands the best chance of always being as up-to-date as possible; at least more so than in a database across the country that a local hospital in Orlando, FL (should Mr. Feinberg require emergency care) may not be able to access (or update) in real time.
Feinberg said that we could store our health records on a credit card that gets stored in our wallets, but that in that location, the data is already further away from its source than it needs to be. Feinberg talked about how people can become separated from their wallets in the course of an emergency and then jokingly talked about how, if we become separated from our hands, we may have a problem that's too serious for our health records to be of much help.
The point made by the Gartner analysts is that there's a bit of an urban myth to the idea that data must always be stored -- or cached -- in a database. Sometimes when you really think about the business processes that the data must support and then the degree to which the data must persist to support those processes, you may realize that you don't need a database after all. As data is moved closer to its source and only kept in one place, not only is the quality better but, according to Friedman, "the data is where you need it, when you need it and only lasts for as long as you need it."
To prove their points, the analysts talked about how data is becoming more and more distributed and the need for databases to house that data is becoming less and less. They also talked about how, in the future, only 20 percent of the data that's stored will be structured data anyway -- the kind of data that is stored in a relational database management system (RDBMS) and can be queried with the Structured Query Language (SQL). The result is that structured data and SQL will take a back seat to XML and XQuery. "Searching [unstructured data] will be important," warned the analysts. "Hence the high value of Google."
The analysts also warned that structured data and SQL won't be the only things that take the hit as a result of highly distributed and often unstructured data. Database administrators (DBAs) as we know them today could be an endangered species as well. "I don't need a DBA to manage the data on a can of soup," said one of the analysts (I can't remember which one).
Feinberg and Friedman advised the DBAs in attendance that they should be thinking about which of two new roles they'd prefer: that of a repository administrator or that of a data service administrator. Whereas the former's job is to know all there is to know about the data (where it's located or should be located, how it's structured, how it's modeled or needs to be modeled, etc.), the latter's job is to manage the data services that are consumed by an organization's various applications and business processes. The latter's job also implies that many of the former functions of the database as they relate to the data (i.e., reliability, security, policy, etc.) become the job of the services layer of the software infrastructure rather than the RDBMS layer (in other words, service-oriented architectures, or SOAs, play a big role in this somewhat database-less future).
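To make the idea of policy moving into the services layer a little more concrete, here is a minimal sketch in which an access check sits in the service rather than in the data store. The `DataService` class, the role names, and the in-memory `shelf` dictionary are all hypothetical and are not drawn from the Gartner session:

```python
from dataclasses import dataclass
from typing import Callable, Set


@dataclass
class DataService:
    """Hypothetical data service: wraps any data source behind a policy check."""

    name: str
    fetch: Callable[[str], dict]  # reads from wherever the data actually lives
    allowed_roles: Set[str]       # access policy held by the service, not the store

    def get(self, key: str, role: str) -> dict:
        # Security is enforced here, in the services layer.
        if role not in self.allowed_roles:
            raise PermissionError(f"{role} may not call {self.name}")
        return self.fetch(key)


# Stand-in for data that lives at its source (for example, the latest tag scan).
shelf = {"soup": {"on_shelf": 2}}

inventory = DataService(
    name="inventory",
    fetch=lambda key: shelf[key],
    allowed_roles={"restocker"},
)

result = inventory.get("soup", role="restocker")
print(result)
```

Applications call `inventory.get(...)` without knowing where the data lives, which is roughly the arrangement a data service administrator would be managing.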
Not only will RDBMSes decline in relevance due to the distribution of data and heavier reliance on the services layer, but, Feinberg predicted, the increasingly real-time nature of the entire software infrastructure means that business intelligence (BI) -- normally a function reserved for a discrete class of software -- will be woven directly into the line-of-business application layer, the result being a decline in relevance of BI software as well. "The press can quote us," said Feinberg. "We're debunking BI. It's not an application anymore. It's a service that's accessed when and where it's needed and [as said earlier] the data persists only as long as we need it."
The analysts also identified one case where databases will continue to play an important role: where, for reporting purposes, retention of historical data is a requirement for exercises such as long-term analytics. Some data will need to be kept. But not all. In terms of recommendations, the Gartner analysts suggested that user organizations take the following five steps:
- Develop new applications for DBMS independence
- Develop new policies for persistence based on other mechanisms besides DBMS
- Create clear service levels for persistence of information when designing systems
- Foster overlap between middleware and DBA skills
- Identify vendors concentrating on the services and policy vision