
The missing link in the application stack

Network attached memory can help lower costs and accelerate the move to open source because it provides an easier model for availability and scale than even the most expensive, highest-end app servers and frameworks, says Terracotta founder Ari Zilka.
Written by Ari Zilka, Terracotta, Contributor
Commentary--In the last five years, companies deploying Java-based application infrastructures have had a growing number of alternatives to the traditional proprietary J2EE stacks from Oracle, IBM and BEA. Technologies such as open-source application and Web servers, development frameworks, and object-relational mapping (ORM) solutions give any organization, regardless of size and budget, the ability to deploy applications based on lightweight components and infrastructures.

One problem: while open source gives organizations flexible and affordable choices, it has not delivered solid support for mission-critical applications that need to be simultaneously highly available and scalable. Traditionally, only companies that can spend millions of dollars on proprietary stacks, or that can afford to build their own stacks in-house, have been able to support thousands of concurrent users, a requirement for any credible ebusiness.

Until now, that is. Open source application clustering technology is emerging that integrates with existing application infrastructures, both proprietary and open source, and keeps Java business applications highly available and scalable. As the missing link in the open source application stack, these technologies are changing the dynamics in the J2EE market and represent a real threat to the hegemony of IBM, Oracle and BEA in the enterprise market.

If you're lucky enough (or unlucky, depending on how you look at it) to work on an application that sees a lot of traffic, you know the importance of getting that application to be highly available and scalable. Keeping your application running in the face of hardware failures, software upgrades, and increasing user traffic is no simple task.

The big challenge is always how to make your application state (session data, caches, indexes, etc.) available across application server instances, so that you can add more servers to keep up with user requests as load increases and survive restarts of parts of your Web cluster without the end user noticing. This is usually done in one of two ways: 1) write custom code to serialize that data out to the cluster, or 2) write that data out to your database. While these solutions can be made to work, they are inevitably difficult to work with and hard to scale.

Writing custom serialization code may seem easy to start with, but maintaining it can quickly turn into a nightmare, especially as your application grows and you have more and more state to manage. Keeping track of what has to be serialized and moved around the Web cluster can be a real pain and can really get in the way of adding new features and improving existing ones.
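To make that maintenance burden concrete, here is a minimal sketch of the kind of hand-written serialization code in question, using only standard java.io classes. The CartSession class, its fields, and the SessionReplication wrapper are hypothetical names invented for illustration, and the actual network send to other servers is left out entirely.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class SessionReplication {

    /** Hypothetical session state; every field it holds must itself be serializable. */
    public static class CartSession implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String userId;
        private final List<String> items = new ArrayList<>();

        public CartSession(String userId) { this.userId = userId; }
        public String getUserId() { return userId; }
        public List<String> getItems() { return items; }
        public void addItem(String sku) { items.add(sku); }
    }

    /** Flattens the whole object graph to bytes, as one would before shipping it to peers. */
    public static byte[] toBytes(CartSession session) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(session);
        }
        return buffer.toByteArray();
    }

    /** Rebuilds a separate copy of the session on the receiving side. */
    public static CartSession fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (CartSession) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        CartSession session = new CartSession("user-42");
        session.addItem("sku-1");
        byte[] wire = toBytes(session);      // in a real cluster these bytes would go over the network
        CartSession copy = fromBytes(wire);  // what another server would reconstruct
        System.out.println(copy.getUserId() + " " + copy.getItems());
    }
}
```

Even in this toy version, the whole object graph is copied on every change, and each new piece of state has to be made serializable and wired into the same path.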

Dumping state into the database just pushes your scalability problems down to the database tier. As your traffic grows, you find that more and more of your database resources are being consumed taking care of application data that doesn’t really belong there. Sooner or later, someone’s going to tell you to stop abusing the database.

The solution to this problem, and the missing link in enterprise Java development, is a plug-in that provides high availability services to the JVM. To date, high availability (HA) has been the job of the developer. HA must come from a plug-in because the need for HA should not affect the programming language in ways that force frameworks and developers to account for it manually. It must also honor the current Java deployment model: scaled-out copies of applications running on multiple hardware nodes, where each node is autonomous.

The HA plug-in closes this gap by simplifying scaled-out computing for Java somewhere other than in business logic (e.g., using messaging or object-relational mapping). We have lived without this plug-in since Java’s beginnings, and we have convinced ourselves of the necessity to solve "the EJB problem," meaning a better developer’s framework for building enterprise apps. But enterprise applications are not all messaging, or database, or SOA applications. They share only one common thread: high availability and high levels of scale are important in most enterprise programming.

The HA plug-in attaches to the Java virtual machine and honors the programming language semantics while introducing the ability to make certain data resilient and sharable across Java processes running on one hardware machine or on many. The plug-in should be thought of as network attached memory (think NAS but for RAM). If application servers could work in such a shared "scratch space," they could collaborate and they could restart, all without a database, using nothing more than POJOs.
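As a rough illustration of what "nothing more than POJOs" might look like from the developer's side, consider the sketch below. It is plain Java with no clustering API at all; the ScratchSpace class and its billingZipDrafts map are hypothetical names. In an ordinary JVM the map is visible only to threads in that one process. The assumption, not demonstrated here, is that a network attached memory plug-in would make such a field visible to every JVM in the cluster and treat the synchronized blocks as cluster-wide locks.

```java
import java.util.HashMap;
import java.util.Map;

public class ScratchSpace {

    // Hypothetical shared "root" object. In plain Java it lives in one JVM only;
    // the assumption is that an HA plug-in would share it across processes.
    private static final Map<String, String> billingZipDrafts = new HashMap<>();

    /** Records the customer's latest in-progress zip code entry. */
    public static void saveDraft(String customerId, String zip) {
        synchronized (billingZipDrafts) {   // ordinary Java locking, no cluster calls
            billingZipDrafts.put(customerId, zip);
        }
    }

    /** Returns the latest draft, or null if the customer has not entered one. */
    public static String readDraft(String customerId) {
        synchronized (billingZipDrafts) {
            return billingZipDrafts.get(customerId);
        }
    }

    public static void main(String[] args) {
        saveDraft("customer-1", "9410");    // a typo along the way
        saveDraft("customer-1", "94107");   // the corrected entry
        System.out.println(readDraft("customer-1"));
    }
}
```

Note what is absent: no Serializable interfaces, no cache API, no SQL. The code reads and writes ordinary objects under ordinary locks.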

The HA plug-in can be compared to a scratch pad used during a test. For example, when taking a computer-graded multiple choice math test, where answers can be 'A,' 'B,' or 'C,' there is no notion of showing our work on the answer sheet. We might use scratch paper to do all our calculations and enter 'B' as our answer to a particular problem, but the teacher does not collect our scratch paper at the end of the test, only the answer sheet with the little circles filled in to represent answers. Similarly, an application may need to know our customers' billing zip codes but it does not need to know all the typos and changes the customer made along the way to enter that zip code properly. Most importantly, storing scratch data in expensive storage such as a database slows down the application tier significantly.

Network attached memory ensures that applications can attach at startup and work in a shared space. Network attached memory also provides the ability to pass work from one application server to another or to spread work amongst many nodes, thus allowing for a divide-and-conquer strategy. Network attached memory affords developers the luxury of viewing the JVM as larger than any physical machine while simultaneously allowing the operator to stripe the application across many machines running underneath a load balancer.
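The divide-and-conquer idea can be sketched with a shared work queue. In the example below, threads inside a single JVM stand in for application server nodes, and a standard java.util.concurrent queue stands in for the shared memory space. The SharedWorkQueue class and its process method are hypothetical names; how a real plug-in would extend such a queue across machines is assumed, not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

public class SharedWorkQueue {

    /** Splits taskCount units of work among workerCount workers and returns the combined result. */
    public static long process(int taskCount, int workerCount) throws InterruptedException {
        // Stand-in for the shared space: every worker pulls from the same queue.
        BlockingQueue<Integer> tasks = new LinkedBlockingQueue<>();
        for (int i = 1; i <= taskCount; i++) {
            tasks.add(i);
        }

        AtomicLong total = new AtomicLong();
        List<Thread> workers = new ArrayList<>();
        for (int w = 0; w < workerCount; w++) {
            // Each thread plays the role of one application server node.
            Thread worker = new Thread(() -> {
                Integer task;
                while ((task = tasks.poll()) != null) {  // stop when no work is left
                    total.addAndGet(task);               // the "work" here is just summing
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return total.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(process(100, 3)); // prints 5050
    }
}
```

Because any worker can take the next task, adding workers spreads the load, and a worker that disappears leaves the remaining tasks in the queue for the others.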

Organizations often assemble open source software to keep their applications nimble and to reduce license payments to big software vendors, but they find these applications sometimes do not scale very well, and hence are not as lightweight as expected from a total cost perspective. Why aren't lightweight stacks contributing to lighter weight applications? The answer is that without network attached memory, the developer is storing the same "scratch work" in the databases, queues, or API-based caches that he used with expensive application servers, though maybe at reduced license cost with open source. Despite all of the capabilities of these open source products, developers still need network attached memory to move high availability out of code and into infrastructure.

Network attached memory lowers costs because developers stay focused on business logic and don’t need to write complex clustering code. Critically, databases and message queues no longer get overloaded with "scratch" data from a Java application cluster, allowing such applications to scale more cheaply. Network attached memory can help lower costs and accelerate the move to open source software because it provides an easier model for availability and scale than even the most expensive, highest-end app servers and frameworks.

biography
Ari Zilka is the founder and CTO of Terracotta.
