From Sun 1 to T5240

Written by Paul Murphy, Contributor
Right about today, twenty-five years ago, a bunch of geeks at Stanford were firing up their company's first production walk - a half dozen "Sun 1" Multibus boards designed to run Unisoft Unix on MC68000s. Not much later the design stabilised using Bill Joy's first BSD-derived SunOS and Motorola's 10 MHz MC68010 - which supported virtual memory and allowed the machine, right from the beginning, to include at least one onboard Ethernet connection.

My first Sun machine, which arrived in January of 1986, was a 3/160C with a 16.67 MHz MC68020/MC68881 CPU pair, 8MB, and dual 60MB external drives. For about $21,800 it came with a 19" color screen, ran a Unix largely indistinguishable from that on the Vax - and easily blew away a million dollar Vax 8350 on single user performance.

Today's most directly comparable Sun machine is probably their Ultra 40 M2, which despite the name is Opteron, not SPARC, based. Michael Schulman and Michael Burke did a nice review of it under the title "A server on every desk" in last month's Desktop Engineering in which they look at its use in advanced mechanical computer-aided engineering (MCAE) work.

Here's a key bit:


A number of interesting results were obtained when running this combined CAD and MCAE workload:


  • Running on its own, the OCUS benchmark ran in 1800 seconds.
  • The baseline for the MCAE benchmark was 4985 seconds. This was set up to use only one core in the four-core system.
  • Running the MCAE benchmarks using three cores, the elapsed time was 1644 seconds, or 3.03 times as fast as the single-core time (a superlinear speed increase).

This scaling occurs with many analysis programs. Some applications scale well with up to four processors, while others show good scaling to 64 or even 128. With current technology, workstations today can contain up to four cores, which matches up well with certain application scaling limits.

In the final test we ran the OCUS benchmark and the three-core MCAE application at the same time. The OCUS time increased to 1978 seconds, or about 10 percent. The MCAE time increased to 1691 seconds, or 2.8 percent.
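The ratios quoted above are easy to re-derive from the raw timings. What follows is a minimal sketch in Python; the variable and function names are illustrative, not from the review, and the only inputs are the four elapsed times it reports.

```python
# Elapsed times in seconds, as quoted from the Desktop Engineering review.
ocus_alone = 1800
mcae_one_core = 4985
mcae_three_cores = 1644
ocus_combined = 1978
mcae_combined = 1691


def speedup(baseline, parallel):
    """How many times faster the parallel run is than the baseline."""
    return baseline / parallel


def slowdown_percent(alone, combined):
    """Percentage increase in elapsed time when sharing the machine."""
    return (combined / alone - 1) * 100


mcae_speedup = speedup(mcae_one_core, mcae_three_cores)
# Efficiency above 1.0 on three cores is what "superlinear" means here.
mcae_efficiency = mcae_speedup / 3

ocus_slowdown = slowdown_percent(ocus_alone, ocus_combined)
mcae_slowdown = slowdown_percent(mcae_three_cores, mcae_combined)

print(f"MCAE speedup on three cores: {mcae_speedup:.2f}x")
print(f"Per-core efficiency: {mcae_efficiency:.3f}")
print(f"OCUS slowdown when combined: {ocus_slowdown:.1f}%")
print(f"MCAE slowdown when combined: {mcae_slowdown:.1f}%")
```

A per-core efficiency slightly above 1.0 is consistent with the review's "superlinear" remark, though a single run on each configuration says nothing about run-to-run variation.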

Today, or next week actually, Sun is going to introduce some commemorative discounts without, I believe, announcing a 25th anniversary, UltraSPARC T2 based, workstation - and that's too bad because I'd sure like one and, in the longer run, that's the direction Sun has to go with its workstation products.

In the interim what I think they will do this week or next is announce a firm date for the second generation "Niagara" series - to be sold as the T50XX line - and then start talking about the hardware and software advances that go into their forthcoming "Rock" line - advances that will make using a CoolThreads processor in a workstation look pretty smart.

In the short term, however, what we'll have isn't a workstation but a server - specifically, I'm guessing that a mid-range T5240 (16GB, dual 146GB FC disks, and one 1.4 GHz(?) UltraSPARC T2 with eight cores and full floating point) will be available in July for about what my employer at the time paid for that 3/160C workstation.

The obvious questions, therefore, are: what changed over that period, how did Sun's progress compare to Wintel's, and did either of them beat Moore's law? That is, if you interpret Moore's law as predicting a rough doubling in processing power every 18 months, the 22 years since the 3/160 was introduced should have seen processing power increase by a factor of about 32,768 (2^15, rounding 14.67 doublings up to 15). So did it?
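The Moore's law target can be checked with a few lines of arithmetic. This is a sketch only: it assumes a strict 18-month doubling period, and the function name is made up for illustration.

```python
def moore_factor(years, doubling_months=18):
    """Growth factor predicted by doubling every `doubling_months` months."""
    doublings = years * 12 / doubling_months
    return 2 ** doublings


# 22 years is 14.67 doublings, which gives a factor of roughly 26,000.
exact_22_years = moore_factor(22)

# 32,768 is 2**15, i.e. exactly 15 doublings, or 22.5 years.
fifteen_doublings = moore_factor(22.5)

print(f"22 years:   {exact_22_years:,.0f}")
print(f"22.5 years: {fifteen_doublings:,.0f}")
```

Either way the target is in the tens of thousands, so the choice between 14.67 and 15 doublings does not change the comparison that follows.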

Now you'd think this would be an easy question to answer, but it's not. Instead, the benchmark numbers I have just can't sensibly be compared - and if you can do better, please let me know because I suspect that the right answer would surprise most of us.

Thus it's interesting, but neither terribly enlightening nor remotely definitive, to compare results on Reinhold Weicker's V1.1 Dhrystone test - in part because this version tested memory access back in 1985, but can now be done wholly in cache on most machines.

Nevertheless, we have those numbers: the 16.67 MHz 3/160 got a 2.4 MIPS rating, the 6 MHz PC-AT a 0.69. A 296 MHz US2 (running Solaris 10!) gets 436, and Roy Longbottom's PC benchmark site records a score of 7,145 for last year's 2.4 GHz Intel Core Duo. Extrapolate those numbers, assuming that eight-way chip-level multi-threading only doubles throughput, and you get an estimated score of about 33,000 for the T5240 versus about 9,000 for the 3.0 GHz dual core Xeon. Given the inaccuracy of the guesstimates involved this amounts to a wash, with the numbers suggesting an improvement factor of about 13,700 for Sun versus 13,000 for Intel.
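The extrapolation method isn't spelled out, so the following Python sketch is only one reading that happens to reproduce the rounded figures. It assumes scores scale linearly with clock speed, that the T5240 estimate multiplies by eight cores and then by the assumed 2x threading gain, and that the Xeon estimate is a pure clock scaling of the Core Duo score. All variable names are illustrative.

```python
# Measured Dhrystone V1.1 ratings quoted in the text.
sun_3_160_mips = 2.4
pc_at_mips = 0.69
us2_score, us2_mhz = 436, 296
core_duo_score, core_duo_ghz = 7145, 2.4

# Assumptions (not confirmed by the text): linear clock scaling,
# eight cores, and multi-threading worth only a 2x throughput gain.
t2_mhz = 1400
t2_cores = 8
threading_gain = 2
xeon_ghz = 3.0

t5240_estimate = us2_score * (t2_mhz / us2_mhz) * t2_cores * threading_gain
xeon_estimate = core_duo_score * (xeon_ghz / core_duo_ghz)

# Improvement factors relative to the 1985/6 machines.
sun_factor = t5240_estimate / sun_3_160_mips
intel_factor = xeon_estimate / pc_at_mips

print(f"T5240 estimate: {t5240_estimate:,.0f}")
print(f"Xeon estimate:  {xeon_estimate:,.0f}")
print(f"Sun improvement factor:   {sun_factor:,.0f}")
print(f"Intel improvement factor: {intel_factor:,.0f}")
```

Changing `threading_gain` or dropping the linear clock assumption moves the Sun figure by large multiples, which is a fair measure of how soft these numbers are.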

Both of these fall well below the Moore's law prediction - in fact only IBM's Cell processor seems likely to beat Moore on floating point. What's driving that discrepancy, though, probably isn't hardware lag so much as the fact that workloads, benchmarks, and expectations have all changed significantly over the period - in effect, the workstation to server shift.

Back then John Walker's fbench ray tracing test took 63.07 seconds on a 6 MHz PC-AT and 8.43 seconds on the 16.67 MHz Sun 3/160 - but today timing ten iterations of that test on either SPARC or x86 produces a user time estimate that falls below the lower limit for trustworthy results from the csh time function.

Back then, waiting for compiles or some floating point process was a real time-killer, but today, for most of us, the machines have long since become fast enough for what we do, making those things non-issues. What we care about instead is stuff like web server throughput - and on that basis experience with the existing UltraSPARC T1 line suggests that the T5240 should provide roughly the throughput equivalent of a rackmount with three to six dual core Xeons, depending on the specific software and user loads.

What we need, therefore, are real performance numbers from a web server run on both earlier machines. Since there is a Minix web server that works at least as far back as the 80386, and the 3/160 certainly had the memory and disk resources to support this, getting those numbers should be possible - for somebody. Unfortunately that somebody isn't me - so part of my reason for writing this was to ask for help: anybody know anybody with real numbers?

Or any other fair basis for comparing the 1985/6 and 2007/8 Sun generations?

