Clouds based on open-source OpenStack software may not be as good at ingesting large amounts of data as those from Microsoft or Amazon, a study has found.
The study evaluated Amazon Web Services, Windows Azure and Rackspace Cloud Files, and was released by cloud storage specialist Nasuni on Wednesday.
It found that Rackspace's OpenStack-based service had the worst performance, as it took nearly a week to transfer 12TB of data out of the Amazon Simple Storage Service (S3) and into Rackspace's cloud. This compared with 40 hours for moving 12TB from S3 to Windows Azure, and four hours for moving 12TB from one S3 storage 'bucket' to another.
"[Rackspace's] poor transfer-in performance gave rise to concerns within Nasuni about all the other clouds that are springing up based on OpenStack," Nasuni wrote (PDF). "It is hard for Nasuni’s engineers to imagine that these other clouds based on OpenStack would perform better than Rackspace’s Cloud Files, since Rackspace is OpenStack's premier reference implementation."
Nasuni tested the clouds in five ways, using a 12TB dataset of 22 million files of mixed sizes, with an average file size of around 550KB. It transferred 200GB of this data between the three clouds and used that information as the basis for an estimate of how long it would take to move 12TB. The results were:
- One Amazon S3 bucket to another S3 bucket: Four hours
- Amazon S3 into Windows Azure: 40 hours
- Amazon S3 to Rackspace: Just under a week
- Microsoft Windows Azure to Amazon S3: Four hours
- Rackspace to Amazon S3: Five hours
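The extrapolation from a 200GB sample to the full 12TB dataset can be illustrated with a minimal sketch. It assumes transfer time scales linearly with data volume and uses decimal units (1TB = 1,000GB). The article does not describe Nasuni's exact calculation, so the function name and the 30-minute sample duration below are hypothetical and are not figures from the report.

```python
def extrapolate_hours(sample_gb: float, sample_hours: float, target_gb: float) -> float:
    """Scale a measured sample transfer time linearly to a larger dataset.

    Assumption: throughput stays constant, so time grows in proportion
    to the amount of data moved.
    """
    if sample_gb <= 0:
        raise ValueError("sample_gb must be positive")
    return sample_hours * (target_gb / sample_gb)


SAMPLE_GB = 200.0      # sample size mentioned in the article
TARGET_GB = 12_000.0   # 12TB, using decimal units (an assumption)

# Hypothetical measurement: the 200GB sample took 30 minutes.
# This duration is invented for illustration only.
hypothetical_sample_hours = 0.5

estimate = extrapolate_hours(SAMPLE_GB, hypothetical_sample_hours, TARGET_GB)
print(f"Estimated time for 12TB: {estimate:.1f} hours")  # 30.0 hours
```

Under these assumptions the scaling factor is 60, so every minute spent on the sample corresponds to an hour on the full dataset. A linear estimate like this would miss any effect that only shows up at scale, such as throttling or rising error rates.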
To control the data migration, Nasuni used an Amazon Elastic Compute Cloud (EC2) 'm1.large' instance. Because it regards Amazon's compute and storage clouds as sufficiently distinct from one another, it does not think choosing a compute node from a different cloud "would have much effect on the relative performance of each storage cloud to one another."
Nasuni found that transfer-in rates for the clouds varied significantly, with Amazon peaking at 270MBps, Azure at 30MBps and Rackspace at 10MBps.
It discovered that Azure's performance varied significantly depending on the time of day, with the worst data transfer rates happening during normal business hours.
The company found that as it used more and more virtual machines to load data into Azure, error rates climbed beyond what it expected. "Perhaps there are challenges at the account or container level in Azure’s architecture," it wrote in the report.
Ultimately it concluded Amazon's performance was a factor of 10 better than its closest competitor — Windows Azure — with both services far ahead of Rackspace.
"Unfortunately, the [cloud service providers] are not very forthcoming about why their performance would vary so greatly," the company wrote. "Nasuni did not experience the same behaviour with Amazon S3, and this measurement probably further indicates limitations in Azure's architecture or bandwidth, as other customers using the system appear to be affecting our results to a large degree."