According to the Wikipedia definition, big data "is an all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process". The Information Commissioner's Office (ICO) also believes that big data is a large and difficult area — but for a reason other than its sheer size.
In its first report on big data (PDF), the UK's data watchdog sets out "how big data can — and must — operate within data protection law".
In particular, the report explains how the law applies when big data systems make use of personal information, with the aim of showing "which aspects of the law organisations need to particularly consider" while helping companies "stay on... the right side of the law and still innovate".
The ICO's 50-page report tries to tackle the key issues around big data and data protection. That is no mean feat, not least because many people see big data as being, to a greater or lesser extent, incompatible with data privacy.
For example, the Data Protection Act's second data protection principle — known as 'purpose limitation' — creates a two-part test for data controllers: firstly, is the purpose for which data is collected specified and lawful; and secondly, if the data is further processed for any other purpose, is it incompatible with the original purpose?
As the report says: "It has been suggested that big data challenges the principle of purpose limitation, and that the principle is a barrier to the development of big data analytics."
This is because most users of big data analytics see it as "a fluid and serendipitous process, in which analysing data using many different algorithms reveals unexpected correlations that can be used for new purposes". On that view, the purpose limitation principle restricts an organisation's freedom to make these discoveries and innovations.
In other words, it stops big data doing what it is meant to do.
The ICO has an answer to that: "What is called 'purpose limitation' could more accurately (if more awkwardly) be described as 'non-incompatibility'," the report says.
The Data Protection Act "does not say that processing for a new purpose is not permissible", the ICO adds, "nor does it say that the new purpose must be the same as the original purpose, nor even that it must be compatible with the original purpose: it says that it must not be incompatible with it".
It's a thorny matter, and one that may be open to wide interpretation. Steve Wood, the ICO's head of policy delivery, recommends transparency when using big data systems.
"What we're saying in this report is that many of the challenges of compliance can be overcome by being open about what you're doing," he says. "Organisations need to think of innovative ways to tell customers what they want to do and what they're hoping to achieve."
Earlier this month, the ICO said that it had dealt with a record number of data complaints from members of the public.