Big Data technology may be impressive all by itself, but just like anything else, it really needs a killer app. The good news: it's quite possible that Big Data now has a killer app, in the Internet of Things (IoT). The bad news: IoT work requires the use of streaming data technology, and the barrier to entry there can be very high indeed.
Most streaming data solutions have required imperative coding against the data in the stream, and have used a "message bus" approach to represent the data. That model has been used for a long time in the rarefied world of application integration. But most Enterprise developers have conventional SQL database skill sets, and aren't used to treating data as messages. So the killer app is here, but it has lacked a killer interface to make it accessible.
The Amazon Web Services (AWS) cloud computing platform has reflected this Catch-22. The combination of Kinesis Streams and the Kinesis Client Library offers an imperative programming interface to streaming data. Kinesis Firehose eases things a bit, as it can push data from a Kinesis Stream into Amazon S3, Redshift or Elasticsearch Service, where the data can be queried using a variety of familiar programming and BI tools.
Easy does it
But what if you want to get to the data right from the stream, in real time, and do it with a conventional SQL skill set? Kinesis Analytics now makes that possible.
The product's general availability was announced last week at Amazon's AWS Summit event in New York, and a breakout session at that event covered it in detail.
Take a look at the recording of the session on YouTube, including the demo that begins at 43:07, and you'll see that Kinesis Analytics is reasonably easy to work with. By projecting a SQL syntax over data streams -- essentially representing them as database tables -- programmers with conventional skill sets can start to get busy with streaming data.
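To make the contrast concrete, here is a minimal sketch in Python of the two styles: hand-written, message-at-a-time code versus a SQL query over the same records treated as a table. It is not Kinesis Analytics code. The ticker records, the function names and the `ticks` table are hypothetical, and an in-memory SQLite database stands in for the stream-as-table abstraction purely for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical ticker records, standing in for messages read from a stream.
records = [
    ("AMZN", 745.0),
    ("MSFT", 56.5),
    ("AMZN", 747.0),
    ("MSFT", 57.5),
]


def average_imperative(stream):
    """Message-at-a-time style: maintain running state by hand."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for ticker, price in stream:
        totals[ticker] += price
        counts[ticker] += 1
    return {ticker: totals[ticker] / counts[ticker] for ticker in totals}


def average_sql(stream):
    """Table style: load the records, then describe the result in SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ticks (ticker TEXT, price REAL)")
    conn.executemany("INSERT INTO ticks VALUES (?, ?)", stream)
    rows = conn.execute(
        "SELECT ticker, AVG(price) FROM ticks GROUP BY ticker"
    ).fetchall()
    conn.close()
    return dict(rows)


imperative_result = average_imperative(records)
sql_result = average_sql(records)
```

Both functions produce the same per-ticker averages. The difference is that the SQL version states what result is wanted and leaves the bookkeeping to the engine, which is the skill set most Enterprise developers already have.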
License to kill?
Kinesis Analytics is actually an OEM'd (read: licensed and adapted) implementation of a product called SQLstream, from a company of the same name. So it's not groundbreaking, but it is very tightly integrated with the other two Kinesis services.
It's also made very accessible from the AWS Management Console, where it can be easily connected to a demo stream of stock ticker data, allowing anyone with an AWS account to get hands-on with it very quickly.
You're not that special, Amazon
As it happens, Amazon is playing catch-up here. Not only has SQLstream been on the market for quite a while, but so has Azure Stream Analytics (ASA), from Amazon's cloud nemesis, Microsoft. ASA also uses a SQL-like query language on top of its own streams, supporting the same "tumbling window" and "sliding window" concepts that show up in Kinesis Analytics.
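For readers new to the two window types, the following is a rough Python sketch of the general idea, not the exact semantics of either Kinesis Analytics or ASA, whose definitions differ in their details. The event data, the function names and the choice to treat a sliding window as "the last `size` time units ending at each event" are all assumptions made for illustration.

```python
def tumbling_windows(events, size):
    """Group (timestamp, value) events into fixed, non-overlapping windows.

    Each window covers [start, start + size); every event lands in exactly one.
    """
    windows = {}
    for ts, value in events:
        start = (ts // size) * size
        windows.setdefault(start, []).append(value)
    return windows


def sliding_windows(events, size):
    """For each event, collect the values seen in the `size` time units
    ending at that event, i.e. timestamps in (ts - size, ts].

    Consecutive windows overlap, so one event can appear in several.
    """
    result = []
    for ts, _ in events:
        window = [v for t, v in events if ts - size < t <= ts]
        result.append((ts, window))
    return result


# Hypothetical (timestamp, value) readings.
events = [(0, 10), (3, 20), (5, 30), (9, 40), (12, 50)]

tumbling = tumbling_windows(events, 5)
sliding = sliding_windows(events, 5)
```

With these sample events, the tumbling version yields three disjoint buckets starting at 0, 5 and 10, while the sliding version yields one overlapping window per event. In a SQL-on-streams product, that choice is typically expressed declaratively in the query rather than coded by hand.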
So the new AWS offering is less a giant step for streaming data analytics than it is a corroborating vote from Amazon on abstracting and querying streaming data as if it were data in a conventional database. In fact, if you take a look at Kafka Streams (supported by Confluent) and Structured Streaming in Apache Spark, you'll see support for the same paradigm. So AWS' contribution here is to bring unanimity of support for this approach.
Are we there yet? No.
Some pieces are still missing, though. If IoT and streaming are really going to be Big Data's killer app, we need more integration. Amazon has additional services for IoT and data visualization; why aren't those incorporated? While developers can compose these services on their own, that leaves a lot of residual friction which, in turn, means killer app status is still elusive.
Right now, for successful IoT streaming analytics initiatives, a lot of assembly is still required. The eye of the needle may be bigger, but it still has to be threaded. We really need things to be prêt-à-porter. Only then will Big Data have an app that's dressed to kill.
Tony Baer provided significant reporting for, and contributed analysis to, this post.