One of the concerns expressed by both users and experts attending the Cloud Computing Congress in London this week was the risk of data being exposed to third parties in a multi-tenant environment. There seems to be a lot of confusion on the matter, so I thought it would be useful to blog a quick overview that may be helpful for people evaluating whether to go multi-tenant.
Intuitively, we feel that if our data is physically on the same computer system (or, in a fully multi-tenant stack, actually in the same database) then there has to be a higher risk of data being exposed. That could happen inadvertently, for example when a software bug or system malfunction gives access to a user of another system on the same shared infrastructure. Or it could happen maliciously, when someone exploits some weakness in the architecture to gain illicit access to data.
In theory, there is some truth in this intuition. But in practice, it depends on what level of multi-tenancy we're talking about and how rigorously it has been architected. The theoretical comparison assumes the same security regime in both cases, whereas in real life the provider of a multi-tenant service is going to put a lot of expertise and resource into making its infrastructure as secure as possible against this kind of data exposure, which would be very bad for its reputation. As a result, most multi-tenant systems are operated to much higher security standards than standalone systems. Look at it this way: in theory, a single house with a fence around it is much more secure than an apartment in a block shared with many other households. In practice, the householders in the apartment block will pool the cost of having a porter on duty 24x7 to control access to the building and monitor security.
There are two main risks to be aware of, depending on what type of infrastructure you're looking at. The first risk applies to a virtualized infrastructure, where a single physical machine hosts many separate virtual machines. There is a theoretical risk that one of the machines in this kind of setup could monitor what its neighbours are doing, burrowing into the underlying infrastructure to bypass security implemented at the software layer. I'm not aware that anyone has shown they've been able to do this in a commercial cloud provider environment, but in theory the risk applies to anything from an infrastructure-as-a-service provider such as Amazon EC2, all the way up to a SaaS provider who is keeping customer data in separate virtual databases.
Some Gartner research that's been publicized this week will fuel the anxieties of those who aren't yet ready to trust multi-tenant clouds, but in fact the detail of the findings bears out what I've said about security measures. Gartner predicts that 60 percent of virtualized servers will be less secure than the physical servers they replace. But this is not because virtualization is inherently insecure, says Gartner's Neil MacDonald. It's because the people implementing this new, unfamiliar technology aren't doing it right. "Most virtualized workloads are being deployed insecurely. The latter is a result of the immaturity of tools and processes and the limited training of staff, resellers and consultants," he explains.
Gartner provides a list of six risk factors that it says should be addressed. I'm sure that most cloud providers will already be on the right side of all these risk factors. It is internal enterprise virtualization projects that are neglecting them. (Please note that some of the risk factors also apply to projects that host virtualized servers on cloud infrastructure, but they are still about user best practice rather than the provider's infrastructure itself.)
Risk number two is the risk that your data will inadvertently get exposed to other users, due to poor implementation of the access management process or some kind of software bug. People are most conscious of this risk in a multi-tenant database, where every customer's data is stored in the same tables, but it also applies where only the application code is shared, since a simple slip could result in redirection to the wrong database. If there's a vulnerability, it could be maliciously exploited, but most of these episodes are cases when a user logs into their system as normal and discovers they're looking at someone else's data. The best known cases of this have been in the online banking world (which hasn't stopped people using online banking, by the way).
The fact that, in theory, there's a greater inherent risk of this happening means that, in practice, providers go to great lengths to ensure that it never does. It is a very simple matter to flag data as belonging to a specific customer and then make sure that flag is always respected when reading data. Providers build and test their software to design out the risk of these data leakages.
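To make the idea of flagging data concrete, here is a minimal sketch in Python using the standard library's sqlite3 module. It is only an illustration under my own assumptions, not how any particular provider does it: the `invoices` table, the `tenant_id` column and the `TenantScopedStore` class are all hypothetical names. The point it shows is that application code never queries the shared table directly; every read goes through a wrapper that always applies the customer's flag.

```python
import sqlite3


def make_shared_db():
    """Create one in-memory database whose table is shared by all tenants."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoices ("
        "id INTEGER PRIMARY KEY, "
        "tenant_id TEXT NOT NULL, "   # the flag marking which customer owns the row
        "amount REAL NOT NULL)"
    )
    return conn


class TenantScopedStore:
    """Hypothetical wrapper: every read and write is bound to one tenant."""

    def __init__(self, conn, tenant_id):
        if not tenant_id:
            raise ValueError("tenant_id is required")
        self._conn = conn
        self._tenant_id = tenant_id

    def add_invoice(self, amount):
        # The tenant flag is stamped on the row by the wrapper, not by the caller.
        self._conn.execute(
            "INSERT INTO invoices (tenant_id, amount) VALUES (?, ?)",
            (self._tenant_id, amount),
        )

    def list_invoices(self):
        # The WHERE clause is always applied, so other tenants' rows are never returned.
        rows = self._conn.execute(
            "SELECT amount FROM invoices WHERE tenant_id = ? ORDER BY id",
            (self._tenant_id,),
        )
        return [amount for (amount,) in rows]
```

Two customers sharing the same table would each get their own `TenantScopedStore(conn, "their-id")`, and a call to `list_invoices()` on one would only ever see that customer's rows. Real systems typically add further layers on top of this, such as enforcing the filter in the database itself, but the principle is the same.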
It comes down to trust and confidence. Knowing these risks, do you believe your provider will have done what's necessary to prevent them occurring? It's also important to weigh up the risks your data is exposed to if you don't use a cloud provider. How securely is it kept on-premises or in a third-party hosting center under your own control? There's a tendency to distrust multi-tenancy simply because it's new and less well understood (and requires us to trust a third-party provider), but we too readily forget the shortcomings of more familiar environments.
One final consideration to bear in mind is the law. There may be types of regulated data for which the law, because it was drafted before virtualization became commonplace, forbids hosting on a shared infrastructure. Unfortunately, the only way to get round this is to get the law changed, even though the unintended effect of following the law may be, paradoxically, to make the data less secure.