I'm a sucker for a tool that adds a layer of abstraction and corrals complexity into something manageable and more straightforward. As such, Impetus' StreamAnalytix product has been on my radar for some time. StreamAnalytix lets you build graphical data pipelines that blend messaging platforms like Apache Kafka, streaming data platforms like Apache Storm, and predictive analytics technologies like R and SAS.
But this morning, Impetus is taking things a step further, announcing the release of StreamAnalytix 2.0, which adds support for Apache Spark Streaming. We are finally beginning to see the triumph of sensible tools over platform in-fighting.
StreamAnalytix already lets data developers integrate a streaming data engine with Apache Kafka and RabbitMQ as publish/subscribe message buses. It also permits the integration of HDFS, Amazon S3, Apache HBase, Cassandra, Solr and ElasticSearch. All of this is done through a combination of drag-and-drop visual programming and an array of declarative functions that let you do things like a code lookup in HBase, Cassandra, or even MySQL.
StreamAnalytix also supports versioning, a SQL syntax for common CEP (complex event processing) tasks, and the ability to push real-time updates via WebSockets. It also lets you process streaming data through your own Java functions: just give StreamAnalytix a class, entry point and parameter info, and it will take care of replicating your code across nodes in a cluster and executing it in a parallel processing configuration.
And if that weren't enough, StreamAnalytix includes its own dashboard authoring tools for displaying data that updates in real time.
One, the other, or both?
Selecting Storm or Spark in StreamAnalytix opens up a designer with both common and platform-specific functions to include in your pipeline. That means a given stream processing design is coupled to a specific engine. But since the pipelines themselves are merely persisted as JSON files, StreamAnalytix could one day even allow for the conversion of pipelines from one streaming engine to the next.
Folks from Impetus said that while this isn't on the roadmap in any official sense, it's a scenario they've considered and one they see as a logical progression from where they are.
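To make the conversion idea concrete, here is a minimal Python sketch of what translating a JSON-persisted pipeline between engines might involve. Everything about the schema is an assumption for illustration: the article only says pipelines are stored as JSON, so the `engine` and `steps` fields, the step type names, and the `convert_pipeline` helper are all hypothetical and do not reflect StreamAnalytix's actual file format.

```python
import json

# Hypothetical pipeline document. The real StreamAnalytix schema is not
# described in the article; these field names are invented for illustration.
STORM_PIPELINE = """
{
  "engine": "storm",
  "steps": [
    {"type": "kafka_source", "topic": "events"},
    {"type": "filter", "expr": "amount > 100"},
    {"type": "storm_bolt", "class": "com.example.MyBolt"},
    {"type": "hdfs_sink", "path": "/data/out"}
  ]
}
"""

# Assumed set of engine-neutral step types (the "common" functions).
COMMON_STEPS = {"kafka_source", "filter", "hdfs_sink"}


def convert_pipeline(pipeline, target_engine):
    """Copy engine-neutral steps to a new pipeline for target_engine.

    Returns the converted pipeline plus the types of any
    platform-specific steps that could not be carried over.
    """
    converted = {"engine": target_engine, "steps": []}
    unsupported = []
    for step in pipeline["steps"]:
        if step["type"] in COMMON_STEPS:
            converted["steps"].append(dict(step))
        else:
            unsupported.append(step["type"])
    return converted, unsupported


pipeline = json.loads(STORM_PIPELINE)
spark_pipeline, needs_review = convert_pipeline(pipeline, "spark")
print(json.dumps(spark_pipeline, indent=2))
print("Steps needing manual review:", needs_review)
```

The sketch suggests where the hard part would lie: common functions carry over mechanically, while platform-specific ones (the made-up `storm_bolt` step here) would need an equivalent on the target engine or a manual rewrite.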
Will Impetus add support for other open streaming platforms, like Apache Flink? Will it add support for cloud services like Amazon Kinesis, Azure Stream Analytics or Google Cloud Dataflow?
Bear in mind that Impetus began life as a services pure-play. So while StreamAnalytix has brought the company into the product world, you can still expect it to triage development efforts based on what its customers want, need and request.
But whether or not support is added, the very fact that the architecture could support it means this platform is one you should make sure to evaluate. The industry needs more tools that bind and simplify the use of so many Big Data open source components. Check out StreamAnalytix 2.0 and you'll get a broad sense of what they should look like.