"Big Data" is a catch phrase that has been bubbling up from the high-performance computing niche of the IT market. Increasingly, suppliers of processing virtualization and storage virtualization software have begun to flog "Big Data" in their presentations. What, exactly, does this phrase mean?
If one sits through the presentations from ten suppliers of technology, fifteen or so different definitions are likely to come forward. Each definition, of course, tends to support the need for that supplier's products and services. Imagine that.
In simplest terms, the phrase refers to the tools, processes and procedures allowing an organization to create, manipulate, and manage very large data sets and storage facilities. Does this mean terabytes, petabytes or even larger collections of data? The answer offered by these suppliers is "yes." They would go on to say, "you need our product to manage and make best use of that mass of data." Just thinking about the problems created by the maintenance of huge, dynamic sets of data gives me a headache.
An example often cited is how much weather data is collected on a daily basis by the U.S. National Oceanic and Atmospheric Administration (NOAA) to aid in climate, ecosystem, weather and commercial research. Add that to the masses of data collected by the U.S. National Aeronautics and Space Administration (NASA) for its research and the numbers get pretty big. The commercial sector has its poster children as well. Energy companies have amassed huge amounts of geophysical data. Pharmaceutical companies routinely munch their way through enormous amounts of drug testing data. What about the data your organization maintains in all of its datacenters, regional offices and on all of its user-facing systems (desktops, laptops and handheld devices)?
Large organizations increasingly face the need to maintain large amounts of structured and unstructured data to comply with government regulations. Recent court cases have also led them to keep large masses of documents, email messages and other forms of electronic communication that may be required if they face litigation.
Like the term virtualization, big data is likely to become an increasingly common part of the IT world. It would be a good idea for your organization to consider the implications of the emergence of this catch phrase.