Bechtolsheim's Galaxy of servers

Q&A How do you make your server stand out from the masses? That's the challenge for Sun's Andy Bechtolsheim.
Written by Stephen Shankland, Contributor

Over a long and distinguished career, Andy Bechtolsheim has earned a reputation as a top-notch engineer. Now that reputation will be put to the test.

The task: Invent Sun Microsystems' next "hot box" out of the comparatively ordinary components of the x86 server market.

That's no mean feat. The server market for machines built with x86 processors, such as Intel's Xeon or, in Sun's case, Advanced Micro Devices' Opteron, is enjoying rapid growth, but it's hard to make one x86 system stand out above the crowd. What's more, Sun is entering a market full of rivals that have years of design and sales experience already under their belt.

The 49-year-old Sun co-founder nonetheless is confident Opteron will help Sun gain that edge. "If the world had not changed with Opteron, then Intel would still be building 32-bit (x86) chips, and it would have been too late for Sun to enter this market," Bechtolsheim said.

In 1995, Bechtolsheim left Sun to start Granite Systems, which developed 1-gigabit-per-second Ethernet networking equipment. In 1996 Cisco Systems acquired the company for $220 million. Seven years later, Bechtolsheim founded Kealia to build special-purpose servers for handling video. Last year Sun bought the start-up with the idea of migrating the Kealia technology to mainstream servers.

Sun hasn't been afraid to raise expectations for the Galaxy line of Opteron servers that now are on sale. "He's the most prolific and exciting and talented workstation and single-board computer designer on the planet," Chief Executive Scott McNealy said of Bechtolsheim when he announced the Kealia acquisition. "With this guy...designing Opteron servers, there ain't going to be nobody who has the class and breadth of computers we have."

Bechtolsheim spoke with CNET News.com about the difficulties Sun has encountered during its embrace of x86 servers, as well as the future of the Galaxy line.

Q: How is Sun different now than it was when you left in 1995?
Bechtolsheim: Well, the funny thing is that one of the suggestions I had when I was leaving was maybe Sun should consider building an Intel-type product line just to make sure that we had that part of the market covered. But Sparc was doing really well back then, and nobody had any interest in that. Now I've ended up doing what I proposed the company should be doing 10 years ago.

The point is that the company got a little too religious the last many years, prior to me coming back. It's a lot less religious these days. We obviously made a deal with Microsoft where we want to work with them to make life easier for customers to bridge the Solaris and the Microsoft operating environments. We're actually working with Microsoft on their services for management. And we now support the full range of operating systems, every version of Linux.

What did you do at Sun in your first career there? What led you out of the company, and then what led you back?
Bechtolsheim: Well, personally, I'm always driven by opportunities. We started Sun around the workstation opportunity (from) the work I did at Stanford--that's where the name came from, the Stanford University Network. Then Sun evolved into a server company, which was another great opportunity with the whole Sparc (processor) direction. In 1995, I saw an opportunity around changing the networking speed from 100 megabits to a gigabit. That got me very excited, so I left Sun to pursue that. I ended up being acquired by Cisco for a lot of money. I stayed there for the next seven years. The Cisco Catalyst 4000 and 4500 series was the product line that my group delivered to the market. It became the world's highest-volume modular chassis switch--I think they shipped over 50 million Ethernet ports.

I got a little restless there a few years back, and I looked at some opportunities around media servers. This was when the Opteron architecture got announced by AMD and it was obvious to me that this architecture would make a significant difference in the market. Now you can't really start a start-up these days to be a server company. It's a little too late. The last two server companies to enter the market were Sun and Dell. It's really hard to enter this business on a grand scale. And so, as a start-up, I was looking at a video market as a vertical market segment opportunity.

When Sun announced that it was going to do Opteron servers, we connected and said we had all this Opteron stuff under development, and would love to do more of that at Sun. We very quickly came up with the deal that brought me back to Sun. Combining the team I brought with me with the existing people at Sun was really the first time Sun made an internal design commitment to industry standard (x86) architecture. Since my return in April last year, we've been very busy working--not just on the systems we're announcing next week, but on a whole bunch of systems that are not yet ready for announcement. All are based around the Opteron architecture.

Our original goal was to deliver a complete family of Opteron capability to the market. The first two members we're announcing are the 1U and 2U boxes (rack-mountable systems 1.75 inches and 3.5 inches thick), which is obviously the highest volume part of the market. In many ways the other systems are more interesting, but we can't talk about them today.

Yeah, I am very curious about, in particular, the eight-processor server.
Bechtolsheim: The company's public that it's working on systems up to eight-way. Obviously we are working on blade servers. But I just can't give you any more specific details on these systems.

So what led you back to Sun is that you wanted to do something like a server start-up, but not just for the vertical markets?
Bechtolsheim: AMD with Opteron is showing a lot of leadership doing the right things for the market and for customers. Building a new business at Sun around that was quite an interesting opportunity and really appealing to me personally. If the world had not changed with Opteron, then Intel would still be building 32-bit (x86) chips. It would have been too late for Sun to enter this market. You can't add any value to a market that doesn't change very much. People used to think with the Wintel duopoly--Windows and Intel--everything was quite stable and nothing would ever change. But now you've got AMD Opteron delivering the best server chip and you've got OpenSolaris and all the Linux stuff, so there's more competition in the industry-standard server space now than there has been in years. That makes life more interesting because it also means there's more innovation.

How much of the Kealia work carried on directly to the Galaxy designs?
Bechtolsheim: The more specific servers that were more media-targeted are actually not being announced next week. I know that they will soon come out. From a time-to-market standpoint, we focused on the 1U and 2U systems, which are the highest volume in the rack-mount server market. They will drive the revenue, quite frankly. From an economic or business impact, they will have the highest impact on Sun.

When Sun started getting into the x86 market, McNealy said these designs are undistinguished--not exactly a dime a dozen, but he said, you just rolled over to Taiwan and put out some bids. Did you run into any resistance that these things, in fact, require a lot of engineering?
Bechtolsheim: Well, quite frankly, one of the misunderstandings of the company at that time was that you could just fly to Taiwan and pick up these boxes. IBM, HP and Dell have been getting (their systems) made in Taiwan or China. But they all specify these things very tightly internally. You cannot buy the Dell box in Taiwan, you can only buy the Dell box from Dell. We had to change the mental model. Success in this market isn't free. We can't just rebadge or relabel third-party systems and deliver value to customers.

Whose x86 market share are you eyeing?
Bechtolsheim: As of last quarter, we actually are now the sixth-largest x86 server (vendor) in the world. We are hoping to move up to No. 4 here in the next calendar year. We are about to pass some people that you wouldn't consider mainstream, but that still are large vendors in their certain geographies. So next year, there are going to be four tier-one x86 server companies.

How do you differentiate yourself in the world of x86 servers? What lets your hardware design stand out above those from IBM or Dell or Hewlett-Packard?
Bechtolsheim: We worked with AMD to bring a higher performance version of the dual-core chip to market. So, this is actually a higher-power version, a 120-watt chip that they especially made for us. I guess they announced that they are going to also sell it to other people, but we're the only company that's going to actually ship this chip for a while into the market. This gives us one speed grade higher than what's available from anybody else.

Everybody wants lower power, but people also want higher performance. The cost of software in most cases is higher than the cost of the hardware, so you actually want the fastest throughput server to minimize the software investment and also the number of machines you have to manage. One disadvantage of dual-core so far was that the clock rate, quite frankly, was lower than the single core. When you put two cores on a chip, the power does go up if you want to keep the old clock rate. So with this higher power chip that we have in both the Galaxy boxes, and that we'll support in all our future boxes, we can bridge the performance gap between dual-core and single-core to the point where we now have truly the world's highest throughput 1U and 2U enterprise boxes.

So you have a marginally faster processor...
Bechtolsheim: Then we have power efficiency. Power efficiency is becoming probably the second most important criteria for most enterprises. They're running out of power and cooling in the data centers, and there's no justification to waste power on less efficient CPUs. This is not a stab against Intel here--they talked about power extensively at the Intel Developer Forum a few weeks ago--but they don't have a power-equivalent solution for probably another year. If power is a criteria for the customer, they really ought to consider Opteron, not Xeon.

Now another thing we did is bundled in a complete ILO (integrated lights-out) management processor; we have a full keyboard-video-mouse and storage (KVMS) emulation, meaning you can do any management action over the Web without having to buy third-party solutions. That's almost a prerequisite in the enterprise world because people can't afford to have a system administrator in every remote branch office. You can completely reboot the server or put in the latest Microsoft security bug-fix.

Galaxies are full enterprise-class chassis. You can hot-swap fans and you don't have to bring the system down, and the same with power. We have RAID (storage controller) on the motherboard. We used the 2-1/2 inch SAS (Serial Attached SCSI) drives, which are less power hungry and smaller than the older 3-1/2 inch disks. These are really next-generation chassis that we plan to keep for a long time, upgrading them to whatever the latest CPU clock rates are. There wasn't a chassis in the market that had the power, the cooling and the redundancy features that we needed to compete.

I see those features as interesting, but not something that's an unbeatable lead over the competition.
Bechtolsheim: No, the truth is these features exist in the Intel market today. What's interesting to us is that most of our competitors have not made that kind of commitment to AMD Opteron. Obviously Dell doesn't sell Opteron at all. IBM has one low-end Opteron system for technical computing, but they don't have enterprise-class Opteron systems. HP has made the broadest commitment, but they don't have a 1U enterprise Opteron server either, and 1U is the highest (volume) one in the market.

What would it take to get you to bring Intel processors into your systems?
Bechtolsheim: We have nothing against Intel. What we told Intel is if they have a better chip, we'll use it. It's just we haven't seen a better chip yet. They have to get absolute performance, cost/performance and power/performance. The real advance of Opteron was putting the memory controller on the CPU. That cut the latency to memory by half or more over Intel. You cannot hide the memory latency for real applications. They do have cache misses. They do go to memory. Right now, when Intel stops and needs something from memory, you sit there for hundreds and hundreds of clock cycles waiting for that memory access. It's not a good way of spending your CPU time. In AMD's case it's just half the latency. I can't speculate in public here when Intel will have a memory controller on the chip, but...

2007.
Bechtolsheim: Exactly.

Now the other part is the HyperTransport interconnect that gives AMD a much more scalable story. We're not launching this product next week, but obviously one key attraction of Opteron was that it scales up to eight (processor) sockets, and Intel does not have that either. They are talking about more scalable things later on, but not in the near future.

Does it require a whole new chipset to do an eight-socket server, or is that something you can just rely on HyperTransport to do?
Bechtolsheim: HyperTransport interconnects the CPUs. There's no other chip investments we had to make to bring these systems to market. So the natural ability of Opteron goes up to 8-way, correct. Whereas in the Intel case, IBM spent $100 million in an effort to do a chipset to make the Xeon MP more scalable.

Do you think Itanium is going to catch on widely?
Bechtolsheim: They shipped about 5,500 systems last quarter. If you look at the IDC numbers, if you get beyond the Intel hype and look at the actual number of systems shipped, they're so small that we can't figure out why anybody is bothering with this. It's just amazing how few of the systems are shipping. What's happening is they're losing the software support. A few years ago, this was projected to be the next $20 billion market, and all the software vendors said, "If that's the case, then we will support it," but now nobody is making any money, so the software vendors are investing where they can make money. The industry is consolidating on (64-bit x86 servers).
