It's reasonable to see big data as a revolutionary development in IT and business in general -- but its implications should make us pause for thought.
ZDNet recently spoke to Bernard Marr, author of the book, Big Data: Using smart big data analytics and metrics to make better decisions and improve performance, to see what's next for the technology.
Q: What is your background?
My background is that I was finishing my degree up in Cambridge, always in the area of data and, especially at that point, performance management. And then I moved through performance management into business intelligence and analytics. So I have seen this whole area from the very basics.
Q: What do you think is the key advantage to be gained from big data?
The advantages are limitless because we now have data on everything and it can help us get new insights on everything. I see the whole spectrum of big data, from NASA using it to analyse real-time data on Mars to healthcare, where I find it particularly amazing how big data is used to predict treatment plans and to predict diseases.
It opens up completely new avenues in terms of combining data with other things such as robotics where you have smart, intelligent machines that can do a lot of the jobs that are currently done by people. Machines will be doing them much better.
Q: But isn't the quality and usefulness of the data a problem? For example, when you place an order with, say, Amazon, the first thing it does is give you the option to buy all kinds of different versions of the product you have just bought. It isn't intelligent.
With Amazon you have to remember that it is still early days so they are using brute force analysis. The system thinks that people have bought this so they will buy it again. There is very little true intelligence in a lot of their predictions.
But I think this is going to change as they use new forms of data, and especially data from new social media feeds and other things. So I think it will become much more accurate.
Sometimes that just means collecting data differently. For example, when you are buying something, you are asked if you are buying this for your wife or for someone else.
I remember that my wife was buying something for a friend who was pregnant and for the next 12 months she was bombarded with pregnancy-related data. This is why they will ask if you want gift wrapping.
There are quite a few challenges there, but there are other interesting issues. Honesty is one. In the past we relied much more on people telling us things so you would go to your GP and he would ask you how many units of alcohol you drank a week and everyone lied.
Now, with a lot of the technologies, you can collect this information automatically. There are sensors that can detect alcohol levels in your body. I see this as a step forward because you no longer have to rely on people telling you how often they used something, what kind of films they like and so on; the data is already there.
Q: How do you get past the fear factor?
For me this is one of the biggest challenges that we are facing at the moment. I campaign for more transparency. With the companies I work with, what I advise them to do is to be 100 percent honest with their customers. They should tell them what data they are collecting and how they are using it, and then, hopefully, give them some benefit in return, as fitness trackers do. I am happy to give this information if it is giving me some fantastic insight, and I don't mind giving this data to be analysed for research purposes.
This is why there was so much uproar about the NHS hospitals sharing information with Google. People thought that if the hospitals are sharing all this information with Google, then Google will know too much about me. They didn't make it clear that it was just one little sub-section of Google, that it was using different algorithms and that the data was stored some distance away from Google itself.
So I think lots of companies are getting it badly wrong by not making explicit what they are collecting and what they are using it for. The vast majority of people that I know are not aware that they have signed something and agreed to this.
I think this will be partly addressed by new privacy legislation that is coming in next year; the regulations will be that companies have to be more explicit. They have to get consent from people so that people can be aware of what is being collected and how it is being used. If you start using it for a different purpose, then you will have to ask for permission again.
And people will have the right to be forgotten, which is new too.
Q: What is the most imaginative thing you have seen big data being used for?
What fascinates me is combining big data with machine learning and especially natural language processing, where computers do the analysis by themselves to find things like new disease patterns, to find them in the data.
I find all that fascinating but another area is the ability of computers to read emotions -- combining the sensors with machine learning and with data algorithms to do what we always believed were very human skills.
Q: What makes big data such an interesting field for you?
I have never seen anything in my lifetime like this. This will transform every single job in the world. I believe that big data, together with all the other trends that we are seeing, is doing just that. I am scared and it fascinates me all at the same time.
I think that if you look at every job, and you can break down most jobs into individual tasks, between 50 and 80 percent of all jobs can be broken down into tasks that computers can do better than people.
And this is scary. Does it mean that we are going to lose between 50 and 80 percent of the jobs? I think all the way through the industrial revolution, we have always been good at finding other [jobs for people to do].
There will be disruption but over the next 15 or 20 years there will be a lot of jobs still available as long as we concentrate on what humans do well.
I am not exactly sure how this affects the whole business market. Now everybody has to work to earn money, so if suddenly a lot of people are not able to do that we will have to rethink our society.
Q: Does that mean inequality?
With a lot of these big companies -- companies like Google, Amazon, and so on -- the people who run them will become richer and richer and most of the others will not be able to participate in wealth generation. So we will need to find different ways to distribute money.