Big Data's 2017: Can more meta thinking free us from current malaise?

Here's a roundup of annual predictions from around the big data industry. Key themes include artificial intelligence/machine learning, cloud adoption, demand for data scientists, and the growing importance of the Internet of Things.
Written by Andrew Brust, Contributor


Every year, a wide array of big data companies send me their executives' predictions for the upcoming year. It's fun to compile them, then read through them, and see what everyone has to say. There's fodder for heckling as some of the predictions are in outright contradiction of each other, and there's ample opportunity for over-confident analysis when two or more predictions corroborate each other.

Extracting insights
What I prefer to do, though, is look for themes that tie the predictions together and provide some taxonomy for understanding the full range of them. That way, even those items that seem contradictory at a superficial level can together shed light at a deeper level, and some consensus can emerge. And even if the consensus predictions may end up being incorrect, we nonetheless get a composite view of what leaders in the industry are thinking and what trends will likely result.

This year, the predictions speak to transition in the industry, including the move from pure-play analytics to more applied scenarios, especially with regard to the Internet of Things, artificial intelligence, and machine learning use cases. The question of siloed skill sets and technology versus an integrated approach comes up, as does the notion of applying analytics technologies to analytics itself.

AI's not new
Let's start with artificial intelligence (AI), which John Schroeder, executive chairman and founder of MapR Technologies, points out is "back in vogue." While AI is a contemporary force, Schroeder rightly points out that "in the 1960s, Ray Solomonoff laid the foundations of a mathematical theory of AI" and that "in 1980 the First National Conference of the American Association for Artificial Intelligence (AAAI) was held at Stanford and marked the application of theories in software."

What's different now, though, is that the data volumes are in fact much bigger, which means the models are better trained and more accurate; the algorithms are better too, and customer interest is orders of magnitude ahead of where it was 30 years ago.
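As a back-of-the-envelope illustration of that point (my sketch, not Schroeder's), even the simplest model pins down its parameters far more accurately as the training set grows. Here's a quick simulation in plain Python, fitting a noisy linear relationship at two data scales:

```python
import random
from statistics import mean

# Illustrative sketch only: more training data yields better-trained models.
# We fit y = w*x + noise by closed-form least squares and measure how far
# the estimated slope lands from the true one, averaged over many trials.
TRUE_W = 3.0

def avg_slope_error(n_samples, trials=50, noise=0.5, seed=0):
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
        ys = [TRUE_W * x + rng.gauss(0.0, noise) for x in xs]
        # closed-form least-squares slope (no intercept term)
        w_hat = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
        errors.append(abs(w_hat - TRUE_W))
    return mean(errors)

err_small = avg_slope_error(20)      # a 1980s-scale training set
err_big = avg_slope_error(20_000)    # a "big data"-scale training set
```

The second error comes out far smaller; the same dynamic, at vastly larger scale and with far richer models, is what makes today's AI work where earlier waves stalled.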

Maybe that's why Rick Fitz, SVP of IT markets at Splunk, says that 2017 will be the year that "analytics go mainstream," adding that "more IT professionals and engineers [will rely] on emerging technologies like machine learning, automation, and predictive analytics to do higher level work behind the scenes."

Automation and employment
Does that mean machines will take away jobs from humans -- an issue that is clearly a factor in today's political landscape? Joe Korngiebel, SVP of user experience at Workday, sees it this way: "No, the machines are not taking over, but we are at a critical inflection point."

What does that really mean? Vishal Awasthi, chief technology officer at SAP partner Dolphin Enterprise Solutions Corporation, foresees "elimination of routine tasks that can be delegated to...bots," but Awasthi tempers this by saying that workers formerly focused on such tasks could "transform their roles into knowledge workers that focus on tasks that can still only be done based on the general intelligence."

Maybe that underlies why the folks at internet jobs site Indeed.com told me "we're now seeing data job postings go up after more than a year of decline." Plus, the growing reliance on automation creates demand for data security professionals. Indeed.com says "data security remains top of mind for businesses and we've seen hockey stick growth since 2013."

Machine learning, uber alles
That observation would seem to corroborate Pentaho CEO Quentin Gallivan's prediction that "Cybersecurity will be the most prominent big data use case." Gallivan is also in accord with the machine learning camp, saying "2017's early adopters of AI and machine learning in analytics will gain a huge first-mover advantage in the digitalization of business."

Gallivan doesn't see this as limited to a few specific use cases, either, adding "this is just as true for the online retailer wanting to offer better recommendations to customers, for large industrial customers wanting to minimize large maintenance costs, for self-driving car manufacturers or an airport seeking to prevent the next terrorist attack."

AI, inside
So where will all this AI smarts live? Toufic Boubez, VP of engineering at Splunk, foresees "the appification of machine learning," explaining that "machine learning capabilities will start infiltrating enterprise applications, and advanced applications will provide suggestions -- if not answers -- and provide intelligent workflows based on data and real-time user feedback." Boubez continues by saying "this will allow business experts to benefit from customized machine learning without having to be machine learning experts."
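To make Boubez's idea concrete, here's a toy sketch (my illustration, not anything from Splunk) of an application feature that surfaces suggestions and quietly refines them from real-time user feedback -- with no machine learning expertise required of the business user:

```python
from collections import defaultdict

class SuggestionWidget:
    """Hypothetical embedded-ML feature: suggests actions to a user
    and learns from click feedback which suggestions work best."""

    def __init__(self, items):
        self.items = list(items)
        self.clicks = defaultdict(int)  # times each item was accepted
        self.shown = defaultdict(int)   # times each item was suggested

    def suggest(self):
        # Prefer the item with the best observed click rate,
        # but surface never-shown items first so all get explored.
        def score(item):
            if self.shown[item] == 0:
                return float("inf")
            return self.clicks[item] / self.shown[item]
        choice = max(self.items, key=score)
        self.shown[choice] += 1
        return choice

    def feedback(self, item, clicked):
        if clicked:
            self.clicks[item] += 1

widget = SuggestionWidget(["export", "reorder", "archive"])
for _ in range(3):                     # explore each action once
    item = widget.suggest()
    widget.feedback(item, clicked=(item == "reorder"))  # user only clicks "reorder"
best = widget.suggest()                # the learned favorite: "reorder"
```

A production application would swap the click-rate heuristic for a proper bandit or recommender model, but the embedding pattern -- suggestions in the workflow, feedback folded back in -- is the same one Boubez describes.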

That line of thinking is amplified by Redis Labs' VP of product marketing, Leena Joshi, who believes that "Enterprise applications that 'learn' quickly and customize user experiences will be the new norm for success." Similarly, Basho's CEO, Adam Wray, believes "organizations will begin to shift the bulk of their investments to implementing solutions that enable data to be utilized where it's generated and where business processes occur -- at the edge." And while that "edge" may refer to physical devices, it can also refer to line-of-business applications.

Data scientists go big, or go home?
The embedding of intelligence in mainstream enterprise software alludes to another key theme in this year's predictions: whether and to what extent we will need data scientists.

Oliver Robinson, director at World Programming, says "education courses geared toward data science careers will increase in popularity..." and adds that "this will help meet the growing demand for data scientists/specialists in the job field." Jeff Catlin, CEO of leading NLP and sentiment analytics provider Lexalytics says "2017 will be the 'Year of the Data Scientist,'" but further predicts that 2018 "is when AI will become buildable...by non-data scientists."

Perhaps going further, the folks at DataStax say that "the term, 'Data Scientist' will become less relevant, and will be replaced by 'Data Engineers.'" And even World Programming's Robinson allows for a relaxation of the data scientist title, saying "Machine learning and artificial intelligence will also drive up the need for new types of data specialists."

Regardless of whether we need specialists, scientists, or just greater analytics literacy among workers of all stripes, the question of where the analytics will get done arises. Despite years of hype around the cloud, it has seemed like most big data activity has been on Hadoop clusters that have been installed on-premises. But many in our prediction faculty insist that the cloud is where things are headed, and that even the most conservative of organizations will adopt at least a hybrid approach.

Kunal Agarwal, CEO of Unravel Data, predicts that "in 2017 we will see more Big Data workloads moving to the cloud, while a large number of customers who traditionally have run their operations on-premises will move to a hybrid cloud/on-premises model." Dan Sommer, Qlik's senior director of market intelligence, believes that "because of where data is generated, ease of getting started, and its ability to scale, we're now seeing an accelerated move to cloud." And Snowflake Computing's CEO, Bob Muglia, says that "almost every company, including most financial services, is now committed to adopting the public cloud."

Eric Mizell, vice president, global solutions engineering at Kinetica sees a convergence of the cloud's popularity and the boon to machine learning brought about by graphics processing units (GPUs), stating "the cloud will get 'turbo-charged' performance with GPUs," pointing out that "Amazon has already begun deploying GPUs, and Microsoft and Google have announced plans" and predicting that "other cloud service providers can also be expected to begin deploying GPUs in 2017."

IoT, in the place to be
Much of the need for this extra processing power comes from the sheer volume of input data collected from sensors embedded in devices designed for Internet of Things (IoT) applications.

The braintrust at Analysys Mason Group believes that "the first truly commercial NB [narrow band]-IoT networks will be launched" in 2017 and that "regulators will consider increased oversight of IoT."

StreamSets CEO and founder, Girish Pancha, believes that in 2017, IoT will stop being the shiny new thing and will start getting real, saying "next year, organizations will stop putting IoT data on a pedestal, or, if you like, in a silo." In other words, IoT must be integrated with the rest of the data lifecycle in organizations. Pancha elaborates: "IoT data needs to be correlated with other data streams, tied to historical or master data or run through artificial intelligence algorithms in order to provide business-driving value."
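As a minimal, hypothetical illustration of the correlation Pancha describes (the device names and fields here are invented for the example), tying a raw sensor stream to master data might look like this:

```python
# Master data: a device registry with each sensor's site and rated limit.
master = {
    "t-101": {"site": "plant-A", "max_temp_c": 80},
    "t-102": {"site": "plant-B", "max_temp_c": 60},
}

# Raw IoT stream: readings that are meaningless without that context.
readings = [
    {"device": "t-101", "temp_c": 75},
    {"device": "t-102", "temp_c": 68},
]

def enrich(stream, registry):
    """Join each reading with its device's master record and flag
    readings that exceed that device's rated maximum."""
    for r in stream:
        meta = registry.get(r["device"], {})
        yield {
            **r,
            **meta,
            "over_limit": r["temp_c"] > meta.get("max_temp_c", float("inf")),
        }

alerts = [e for e in enrich(readings, master) if e["over_limit"]]
```

Only the correlated view reveals that t-102's reading is a problem while t-101's identical-looking one is not -- which is precisely the "business-driving value" that siloed IoT data can't deliver.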

On that note, Andy Dearing, CEO of Boundless, a company focused on geographical information systems (GIS) technology, believes that correlating IoT streams with location-based data will be of utmost importance. Dearing predicts that "location-based analytics and platforms that can process and detect trends and provide intelligence will emerge as a popular trend."

Adding specificity to this proclamation, Dearing says that "with self-driving cars and smart cities initiatives becoming more of a reality, it will be imperative to understand how all the location information can be used to make smarter decisions."

Endless loop?
In order to derive such value from IoT data, companies will need competency in processing streaming data. Anand Venugopal, head of product for StreamAnalytix at Impetus Technologies, says "enterprises leveraging the power of real-time streaming analytics will become more sensitive, agile and gain a better understanding of their customers' needs and habits to provide an overall better experience."
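For readers wondering what that competency looks like at its very simplest, here's a hedged sketch (mine, not Impetus's StreamAnalytix API) of a streaming aggregate -- a rolling average recomputed as each event arrives, rather than in a batch after the fact:

```python
from collections import deque

class SlidingWindowAvg:
    """Toy real-time streaming aggregate: the average of the
    last `size` events, updated on every arrival."""

    def __init__(self, size):
        self.window = deque(maxlen=size)  # old events fall off automatically

    def push(self, value):
        self.window.append(value)
        return sum(self.window) / len(self.window)

agg = SlidingWindowAvg(size=3)
for reading in [10, 20, 30, 40]:
    latest = agg.push(reading)
# the window now holds [20, 30, 40], so the latest average is 30.0
```

Real streaming engines add partitioning, time-based windows, and fault tolerance on top, but the core idea -- continuously updated answers over a moving window of events -- is the same.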

The problem is that mere ingest of the data is not sufficient, because that data needs to be cleansed and otherwise prepared. Doing that takes a lot of work, which means the potential for new technology breakthroughs is tempered with old technology realities. Does that make analytics a zero sum game?
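That cleansing step is mundane but unavoidable. A minimal sketch (with illustrative field names, not any particular tool's API) of the kind of preparation raw ingested data typically needs:

```python
def cleanse(records):
    """Basic data preparation: drop incomplete rows, normalize
    value types, and discard duplicate ingests of the same record."""
    seen = set()
    for rec in records:
        if rec.get("id") is None or rec.get("value") is None:
            continue              # drop incomplete readings
        if rec["id"] in seen:
            continue              # drop duplicate ingests
        seen.add(rec["id"])
        yield {"id": rec["id"], "value": float(rec["value"])}

raw = [
    {"id": 1, "value": "3.5"},
    {"id": 1, "value": "3.5"},    # duplicate ingest
    {"id": 2, "value": None},     # incomplete reading
    {"id": 3, "value": "7"},
]
clean = list(cleanse(raw))        # two usable records survive
```

Multiply this by dozens of sources and evolving schemas, and it's clear why onboarding consumes so much of an analytics team's time.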

Break on through
Pentaho's Gallivan doesn't think so, believing instead that the breakthroughs can solve the old problems rather than having to be bogged down by them. His take: "IoT's adoption and convergence with big data will make automated data onboarding a requirement." Put another way, machine learning models will need to be harnessed to prepare data before it's used to train other machine learning models.

That's a pretty transcendent analysis and, with hindsight, pretty straightforward. After all, how can we say with a straight face that our technology is game changing if we don't apply it to changing our own game?

And the world will be a better place
The big data and analytics world got a little stuck in 2016. Hopefully, 2017 will engender more of the "meta" thinking Gallivan has employed here. That would seem to be a ticket out of the current malaise and into the next stage of productivity, and bona fide value, for customers.

Armed with that optimism, we can see how IoT-driven AI/machine learning, running on cloud-based GPUs, embedded in enterprise applications and being run by non-data scientists, can actually work. Let's hope 2017 is the year at least a chunk of that comes to pass.

