Redis wants more than cache

Redis may be ubiquitous as a persistent caching tier, but the company behind it wants you to think about it as an operational database that is extensible.


Redis wants your respect, and in so doing, it wants to be known for more than its most common use case: caching. As an open-source data platform, Redis has grown almost ubiquitous as the real-time database that sits in front of the "real" database that your organization is using for transaction processing or content serving. It's so ubiquitous, thanks to being a compact engine that is available as open source, that it often seems to blend in with the woodwork. No wonder it's ranked eighth in popularity on DB-Engines.

And it answers a common need: providing that buffer that cushions your back-end transaction or content serving systems by offloading the burden of real-time response. That's been a crying need ever since the web got opened to transaction processing and existing database platforms simply could not handle internet-scale loads.
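To make that buffering role concrete, here is a minimal sketch of the cache-aside pattern in Python. It is an illustration only: the `CacheAside` class, its `load_from_backend` callback, and the TTL handling are our own hypothetical names, and a plain dictionary stands in for the in-memory tier rather than an actual Redis client.

```python
import time


class CacheAside:
    """Serve reads from a fast in-memory tier, falling back to a slow back end.

    Hypothetical helper for illustration; a dict stands in for the cache.
    """

    def __init__(self, load_from_backend, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_backend   # slow lookup, e.g. a query on the "real" database
        self._ttl = ttl_seconds          # how long a cached value stays fresh
        self._clock = clock              # injectable so expiry can be tested
        self._entries = {}               # key -> (value, expires_at)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            # Fresh entry: the back end is never touched.
            self.hits += 1
            return entry[0]
        # Missing or expired: pay the back-end cost once, then remember the result.
        self.misses += 1
        value = self._load(key)
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key):
        """Drop a key, for example after the back-end record changes."""
        self._entries.pop(key, None)
```

Only the first read of a key, or the first read after its entry expires, reaches the back end; every other read is answered from memory, which is the cushioning effect described above.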

But before Redis, in-memory data stores could not persist, replicate, or manage schema. This was the era of proprietary in-memory data grids (and later, open-source caching technologies) that temporarily stored data as objects and relied on application code to manage transaction state. The conventional wisdom was that real databases, the kind that handle transactions, couldn't scale, or that memory was too expensive.

Redis Labs, the company, began life under other auspices and originally had a business supporting Memcached before it saw evidence of how Redis scaled. According to CTO and cofounder Yiftach Shoolman, it was a risqué website that drew attention to Redis's scaling.

The company pivoted over to supporting Redis, which today is offered through an open core-like business model. The Redis database is fully open source, but enterprise features and Redis Modules are licensed separately (more about that below).

Redis Labs initially offered Redis only through the cloud but then added an on-premises offering. This offering, Redis Enterprise, encompasses the core database plus capabilities such as multi-cluster support, ACID transactions, automated failover, storage tiering, and security features. Extending the core Redis database is where Redis Modules come in. The modules are a mix of SDKs developed by Redis Labs and the community, and the guiding notion behind them has been to prevent scope creep on the core Redis database, an issue that Redis creator Salvatore Sanfilippo is passionate about. Modules make Redis extensible by adding SDKs for supporting new data types such as JSON, graph, or time-series data; or capabilities such as search, neural network processing, SQL query, or Bloom filtering.

We can't avoid the licensing story here. For modules that Redis Labs developed itself, the company went through its share of licensing weirdness last year before settling on the Redis Source Available License, which lets developers play with the code but, in essence, prevents them from monetizing it (whether there are real exceptions is the confusing part). We pine for the clarity of open core, but that's another story.

The modules expand Redis from its key-value store roots to a multi-model database, thanks to supporting a variety of data types. In a way, the approach mimics that of Cosmos DB in that data is stored in a canonical storage engine but exposed in different views. The prime difference is that, with Cosmos DB, once you commit to a specific API for a data set, such as graph or MongoDB-compatible JSON document, your data is only exposed in that model.

But since Redis stores data in memory (or, with its new data tiering capabilities, on SSD or Intel Persistent Memory, also known as Optane), it doesn't have that restriction. That's possible because, with data stored in memory, the data view can be exposed on the fly. That's very much in keeping with other in-memory databases such as SAP HANA that, for the same reason, can eliminate the need for persisting materialized views.

And so those capabilities allow interaction of time series data with a search engine or a SQL query. Coordinating that is another feature of Redis Enterprise, RedisGears, a utility for building and orchestrating real-time data pipelines. In this case, the model would use Redis Streams to provide a change data capture feed that could interact with each of these data types in real-time.
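The change-feed idea can be sketched in a few lines of Python. This is a conceptual stand-in under our own assumptions, not the Redis Streams or RedisGears API: `ChangeLog`, `Consumer`, and the two view builders are hypothetical names, and an in-process list plays the role of the stream. The point is that one ordered feed of changes can drive several data models at once, each consumer tracking its own position.

```python
from collections import defaultdict


class ChangeLog:
    """Append-only feed of change records (in-memory stand-in for a stream)."""

    def __init__(self):
        self._entries = []

    def append(self, record):
        self._entries.append(dict(record))  # copy so later mutation can't alter history
        return len(self._entries) - 1       # offset of the new entry

    def read_from(self, offset):
        """Return (offset, record) pairs starting at the given offset."""
        return list(enumerate(self._entries[offset:], start=offset))


class Consumer:
    """Applies each new change to one view, remembering where it left off."""

    def __init__(self, log, apply):
        self._log = log
        self._apply = apply
        self._next = 0

    def poll(self):
        entries = self._log.read_from(self._next)
        for offset, record in entries:
            self._apply(record)
            self._next = offset + 1
        return len(entries)  # number of changes processed in this poll


# Two hypothetical views fed by the same change records.
time_series = defaultdict(list)   # sensor -> [(timestamp, value), ...]
search_index = defaultdict(set)   # word -> {sensor, ...}


def apply_time_series(record):
    time_series[record["sensor"]].append((record["ts"], record["value"]))


def apply_search(record):
    for word in record["note"].lower().split():
        search_index[word].add(record["sensor"])
```

Each `Consumer` polls the same log independently, so a time-series view and a search index stay in step with the source data without either one knowing about the other.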

That's prompting adventurous Redis customers to go beyond the cache. At a recent Redis Day session for developers in New York, Redis Labs showcased several customers thinking outside the box. Fiserv, which initially used Redis Enterprise for caching, provided a textbook example. With the introduction of active-active (masterless) bi-directional replication a couple of years ago using Conflict-Free Replicated Data Types (CRDTs), Fiserv wanted to take advantage of the new multi-master capability to eliminate the need for manual copy processes to create replicas. That would provide the advantage of local processing with the "strong eventual consistency" that CRDTs bring. But of course, there are always complications, and they typically involve small things, like ensuring that all the right ports are open to allow the constant flow of change streams between replicas. That didn't necessarily come naturally to the financial services firm, whose default security mode tended to close down most ports.
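For readers wondering what "strong eventual consistency" looks like in practice, here is a minimal sketch of one of the simplest CRDTs, a grow-only counter, in Python. It is a textbook illustration under our own assumptions, not Redis Enterprise's implementation; the `GCounter` class and the replica names are hypothetical.

```python
class GCounter:
    """Grow-only counter CRDT: each replica increments only its own slot."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica_id -> highest count seen from that replica

    def increment(self, amount=1):
        if amount < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        """Fold in another replica's state by taking the per-replica maximum.

        The maximum is commutative, associative, and idempotent, so replicas
        converge no matter how often or in what order they exchange state.
        """
        for replica_id, count in other.counts.items():
            self.counts[replica_id] = max(self.counts.get(replica_id, 0), count)


# Two replicas accept writes locally, then exchange state in both directions.
new_york = GCounter("new_york")
london = GCounter("london")
new_york.increment(3)
london.increment(2)
new_york.merge(london)
london.merge(new_york)
assert new_york.value() == london.value() == 5
```

Because merging is safe to repeat and to apply in any order, neither replica has to act as the master; that property is what removes the need for a manual copy process.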

With the extensions, the Redis folks are broadening their sights. They are providing benchmarks showing Redis doesn't have to be considered a small database anymore. There is a new focus on time series.

Redis's broader aspirations come as, not surprisingly, other platforms, such as Amazon DynamoDB Accelerator (DAX), are encroaching on its turf. As memory prices have declined, it is increasingly common for databases to add in-memory options. Redis is spreading its wings, not only by supporting a broader array of data types, but also by adding tiering to cheaper forms of storage. That's a reality that other in-memory databases like SAP HANA are embracing. When it comes to multi-master, Redis is not alone either. It compares itself to Apache Cassandra (which has tunable consistency) and Google Cloud Spanner (which emphasizes strong consistency). While we don't expect Redis to become a general-purpose transaction database, we could foresee use cases where the real-time envelope is expanded.