Are Salesforce.com's problems due to big iron?

A rival CEO raises the intriguing possibility that Salesforce.com's current outage embarrassments are down to its choice of powerful Sun servers over a grid of smaller machines.
Written by Phil Wainewright, Contributor

An email from NetSuite CEO Zach Nelson raises the intriguing possibility that Salesforce.com's current outage embarrassments are down to its choice of powerful Sun servers as a hardware platform rather than a grid of smaller machines.

"Salesforce.com made a decision very early on to deliver on 'big iron'," he writes. "They chose to deploy on very large Sun servers running Solaris and Oracle. The result is that they have many customers (maybe thousands?) running on a single database. Guess what, when that system (the database, the operating system, the disks) fails, thousands of users and customers are impacted."

This is in contrast to NetSuite's parallel choice, he continues:

"At about the same time as Salesforce.com chose big iron, NetSuite made a different architectural choice. We chose a more grid-like deployment architecture. Specifically our server farm is made up of 2-CPU HP systems running Red Hat and Oracle. We have roughly 50 customers on each box (or we can put a single customer on a box, but that's a story for a different topic). So if we ever have a problem, only a very small fraction of our customer base is impacted. As well, such architecture allows us to detect issues long before they affect a large number of users on our other 100+ servers."

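To put rough numbers on Nelson's argument, here is a minimal Python sketch of the "blast radius" of a server failure. The figures are illustrative assumptions, not NetSuite or Salesforce.com data: it treats the email's "roughly 50 customers on each box" and "100+ servers" as exactly 50 and 100, assumes customers are spread evenly across identical servers, and the function name `fraction_impacted` is hypothetical.

```python
def fraction_impacted(customers_per_server: int, server_count: int,
                      failed_servers: int = 1) -> float:
    """Share of the customer base hit when `failed_servers` boxes go down.

    Assumes every server hosts the same number of customers (an
    illustrative simplification, not a description of any real deployment).
    """
    if customers_per_server <= 0 or server_count <= 0:
        raise ValueError("customers_per_server and server_count must be positive")
    if not 0 <= failed_servers <= server_count:
        raise ValueError("failed_servers must be between 0 and server_count")
    total_customers = customers_per_server * server_count
    return (customers_per_server * failed_servers) / total_customers


# Grid-style layout: 100 small boxes with 50 customers each (assumed figures).
print(fraction_impacted(customers_per_server=50, server_count=100))   # 0.01

# "Big iron" layout: the same 5,000 customers on one shared system.
print(fraction_impacted(customers_per_server=5000, server_count=1))   # 1.0
```

Under those assumptions, one failed box touches about 1% of customers, while a single shared system that holds everyone puts the whole base at risk. The sketch ignores shared dependencies such as networking or storage, which can take out many boxes at once.
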
Another advantage of this arrangement is the ability to do a phased roll-out of any upgrade to the application, he adds:

"We call our roll-out process for new releases 'Phased Release' and we've actually patented how we do it. We've been using Phased Release for more than 2 years now, and I'm pretty sure we are the only SaaS vendor that does. [Editor's note: Many other SaaS vendors have similar processes for phasing the roll-out of new releases. Only the implementation detail differs from what Nelson describes.]

"We phase all releases, including bug fixes, but for major releases we really take our time. For example, with our upcoming 11.0 Release we'll upgrade our customers to it during the course of a quarter, roughly as follows:

    • Beta Test — About one month of availability to customers on test servers
    • Live Release
      — Phased Release to all customers
      — NetSuite goes on it first (1 month)
      — 'Early adopters' go on it next (about 2 weeks). Early adopters are those customers who express interest in getting the new features, usually about 10% of the customer base
      — Remaining 90% of the customers receive the upgrade in a series of phases. About 15% of the remaining customers are upgraded during each successive weekend evening until all customers have been migrated."
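
The schedule Nelson describes can be sketched as a simple wave planner. This is only an illustration under stated assumptions, not NetSuite's patented process: the function name `plan_phased_release` and the 5,000-customer total are hypothetical, and "about 15% of the remaining customers" is read here as a fixed weekend batch equal to 15% of the group left after the early adopters.

```python
def plan_phased_release(total_customers: int,
                        early_adopter_pct: int = 10,
                        wave_pct: int = 15) -> list[tuple[str, int]]:
    """Return (label, customer_count) pairs for each upgrade wave.

    Percentages are whole numbers so the arithmetic stays in integers.
    The default values mirror the rough figures quoted in the email.
    """
    if total_customers < 0:
        raise ValueError("total_customers must not be negative")
    if not 0 <= early_adopter_pct <= 100 or not 0 < wave_pct <= 100:
        raise ValueError("percentages must be within range")

    early = total_customers * early_adopter_pct // 100
    remaining = total_customers - early
    # Ceiling division: each weekend batch is wave_pct% of the remaining group.
    wave_size = -(-remaining * wave_pct // 100)

    waves = [("early adopters", early)]
    weekend = 1
    while remaining > 0:
        batch = min(wave_size, remaining)
        waves.append((f"weekend {weekend}", batch))
        remaining -= batch
        weekend += 1
    return waves


for label, count in plan_phased_release(5000):
    print(f"{label}: {count} customers")
```

With these inputs, 500 early adopters go first and the other 4,500 follow in seven weekend batches, six of 675 and a final one of 450. Added to the month of internal use and the two weeks of early adopters, that is roughly consistent with an upgrade spread "during the course of a quarter".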