Snow Leopard geared for multicore future

Summary: Mac OS X 10.6 begins a longer-term Apple attempt to get ahead by cracking a problem facing the entire computer industry: squeezing useful work out of modern processors.

Apple began shipping Snow Leopard on Friday, but the true importance of the Mac OS X update likely will emerge well afterward.

That's because Mac OS X 10.6 begins a longer-term Apple attempt to get ahead by cracking a problem facing the entire computer industry: squeezing useful work out of modern processors. Instead of stuffing Snow Leopard with immediately obvious new features, Apple is trying to adjust to the new reality in which processors can do many jobs simultaneously rather than one job fast.

See Also: Special Report: Snow Leopard

"We're trying to set a foundation for the future," said Wiley Hodges, director of Mac OS X marketing.

Apple shed some light on its project, called Grand Central Dispatch, at its Worldwide Developer Conference in June, but most real detail was shared only with programmers sworn to secrecy. Now the company has begun talking more publicly about it and other deeper projects to take advantage of graphics chips and Intel's 64-bit processors.

The moves align Apple better with changes in computing. For years, chipmakers such as Intel and Advanced Micro Devices had steadily increased the clock rate of their processors, and programmers got accustomed to a performance boost with each new generation. But earlier this decade, problems derailed the gigahertz train.

First, chips often ended up merely twiddling their thumbs, because slower memory couldn't keep them fed with data. Worse, the chips required extraordinary amounts of power and produced corresponding amounts of hard-to-handle waste heat.

And so began the mainstream multicore era, in which processors got multiple computing engines called cores that work in parallel. That's great for some tasks that can be easily broken down into independent pieces, but programmers were accustomed to a more linear way of thinking where tasks execute in a series of sequential steps.

Enter Grand Central Dispatch, or GCD. This Snow Leopard component is designed to minimize many of the difficulties of parallel programming. It's easy to modify existing software to use GCD, Apple said, and the operating system handles complicated administrative chores so programmers don't have to.

Overall, Illuminata analyst Gordon Haff believes, the computing industry is only now beginning to tackle parallel programming in earnest. If building mature parallel programming tools is a 10-chapter book, the industry is only at chapter two right now, he said. But with no alternative, the book will be written.

"It has to happen," Haff said. "If you look at history of information technology, things that have to happen really do happen."

Burdensome threads
One way programmers have dealt with the arrival of multicore processors--and with the multiprocessor machines that preceded them--is through a concept called threads. There are various types, but generally speaking, a thread is an independent computing operation. To take advantage of a multicore processor, programmers assign one thread to each core, and away they go, right?

Not so fast. Threads come with baggage. Each requires memory and time to start. Programs should be broken up into different numbers of threads depending on how many cores a processor offers. Programmers have to worry about "locking" issues, providing a mechanism to ensure one thread doesn't change data another thread is already using. And one threaded program might step on the toes of another running at the same time.
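
To make that baggage concrete, here is a minimal sketch in C of the locking chore alone, using standard POSIX threads (the shared counter and worker function are hypothetical):

    #include <pthread.h>

    /* Hypothetical shared counter: every thread that touches it must
       take the same lock, or updates can silently corrupt the count. */
    static long counter = 0;
    static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&counter_lock);   /* forget this once: data race */
            counter++;
            pthread_mutex_unlock(&counter_lock);
        }
        return NULL;
    }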

Some tools to ease the difficulties, such as Intel Threading Building Blocks, are available, but threads remain complicated.

"We looked at this and said it needs a fundamental rethink. We want to making developing applications for multicore easier," Hodges said. "We're moving responsibility for the management code into the operating system so application developers don't have to write and maintain it."

Blocking and tackling
The core mechanisms within GCD are blocks and queues. Programmers mark code chunks to convert them into blocks, then tell the application how to create the queue that governs how those blocks are actually run. Block execution can be tied to specific events--the arrival of network information, a change to a file, a mouse click.
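
As a rough illustration of the model, here is a minimal sketch using GCD's C interface and Apple's blocks extension (the printed message is just for demonstration):

    #include <dispatch/dispatch.h>
    #include <stdio.h>

    int main(void)
    {
        /* A system-provided concurrent queue; GCD owns the threads behind it. */
        dispatch_queue_t queue =
            dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
        dispatch_group_t group = dispatch_group_create();

        int value = 42;  /* captured by the block below */
        dispatch_group_async(group, queue, ^{
            printf("block ran with value = %d\n", value);
        });

        /* Wait for the block to finish before exiting. */
        dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
        dispatch_release(group);
        return 0;
    }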

Apple hopes programmers will like blocks' advantages: Older code can easily be retrofitted with blocks so programmers can try it without major re-engineering; they're lightweight and don't take up resources when they're not running; and they're flexible enough to encapsulate large or small parts of code.

"There's a lot of overhead around threading that means you want to break your program into as few pieces as possible. With Grand Central Dispatch, we say break your program into as many tiny pieces as you can conceive of," Hodges said.

Another difference with the Grand Central Dispatch approach is its centralization. The operating system worries about managing all applications' blocks rather than each application providing its own oversight. That central view means the operating system decides which tasks get which resources, Apple said, and that the system overall can become more responsive even when it's busy.

Other foundations
There's a second mechanism in Snow Leopard that gives programmers a new way to tap into hardware power: OpenCL, or Open Computing Language. It lets computers use graphics chips not just to accelerate graphics but also to run some ordinary computations.

To use OpenCL, programmers write modules of code in a variation of the C programming language called OpenCL C. Snow Leopard translates that code on the fly into instructions the graphics chip can understand and transfers necessary data into the graphics system's memory. Many tasks won't benefit, but OpenCL is good for video game physics simulations and artificial-intelligence algorithms, technical computing chores, and multimedia operations.
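
For a sense of what such a module looks like, here is a minimal sketch of an OpenCL C kernel; the host-side code that compiles it and moves buffers is omitted, and the kernel name is hypothetical:

    /* Each work item squares one array element; the graphics chip
       runs many work items in parallel. */
    __kernel void square(__global const float *in,
                         __global float *out)
    {
        size_t i = get_global_id(0);  /* this work item's index */
        out[i] = in[i] * in[i];
    }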

The three major makers of graphics chips--Intel, Nvidia, and AMD's ATI--have endorsed OpenCL, and the Khronos Group has made it a standard. That means programmers are likely to be able to reuse their OpenCL code with Windows applications, too.

Graphics processors employ parallel engines that suit them for running the same processing chore on many data elements. For computers without a graphics chip, though, OpenCL also can employ that parallel execution strategy on ordinary multicore processors.

The 64-bit transition
Apple began its 64-bit transition years ago with the PowerPC processors it used before switching to Intel chips. With Snow Leopard, nearly the full suite of its software--Mail, Safari, Finder, iChat, iPhoto--becomes 64-bit.

Intel chips these days are 64-bit, but what does that get you over 32-bit chips? Briefly, it can let heavy-duty programs use more than 4GB of memory, improve performance by offering more chip memory slots called registers, and speed up some mathematical operations.

Moving to a 64-bit design doesn't guarantee instant speedup, though. In one developer document, Apple states: "Myth: My application will run much faster if it is a 'native' 64-bit application. Fact: Some 64-bit executables may run more slowly on 64-bit Intel and PowerPC architectures." One issue: the doubled length of references to memory addresses.
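
The address-length point is easy to see in a couple of lines of C; a minimal sketch:

    #include <stdio.h>

    int main(void)
    {
        /* 4 bytes in a 32-bit build, 8 in a 64-bit build. Pointer-heavy
           data structures grow accordingly, which can hurt cache use even
           as the extra registers speed other things up. */
        printf("pointer size: %zu bytes\n", sizeof(void *));
        return 0;
    }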

Apple encourages programmers to test their software to see if the 64-bit incarnation is faster. All Apple's own applications that moved to 64-bit versions are faster, the company said.

The 32-bit kernel
However, the core component of Mac OS X, the kernel, is still 32-bit software by default on consumer machines such as MacBooks and iMacs. Apple has written it so that applications can handle more than 4GB of memory, though, and the kernel can manage it all.

In its developer document on 64-bit performance, Apple states: "Myth: The kernel needs to be 64-bit in order to be fully optimized for 64-bit processors. Fact: The kernel does not generally need to directly address more than 4 GB of RAM at once."

Apple's 32-bit kernel hits limits with very large amounts of memory, though. "Thus, beginning in Snow Leopard, the kernel is moving to a 64-bit executable on hardware that supports such large memory configurations"--its Xserve server line and Mac Pro workstations--the company said.

The tricky aspect of moving from a 32-bit kernel to a 64-bit kernel is that drivers--software that lets the operating system communicate with devices such as printers, video cards, and hard drives--must also be 64-bit. That's not so bad when it's a hardware device under Apple's control, but it's harder to move the full collection of third-party devices with their own drivers.

Apple argues it's not hard to make the jump, though. "As a driver developer, you must update your drivers with 64-bit binaries. Fortunately...many drivers 'just work' after changing the compile settings," the company said in a reference document.

This all may sound very low-level, but for programmers, Apple actually is working at a higher level than most. That could be an asset since many attempts to embrace parallel programming imposed more demands than most programmers were willing or able to handle.

And attracting programmers is key. Ultimately, Apple's deeper technology moves such as Grand Central Dispatch and OpenCL will be a success only if the company can get other developers to use them.

This article was originally posted on CNET News.


Talkback

  • You do realize that...

    You do realize that GCD is just a combination of 2 technologies that already existed previously on other platforms, right?

    a) Apple finally added "closures," aka "anonymous methods" or "lambda expressions," to Objective-C. Other languages have had this for many, many years, and it is what you need to create the "blocks" you are talking about.
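
    For anyone who hasn't seen one, a minimal sketch of a block capturing a local variable (C compiled with Apple's blocks extension; the names are just for illustration):

        #include <stdio.h>

        int main(void)
        {
            int base = 10;  /* the block "closes over" this variable */
            int (^add)(int) = ^(int x) { return x + base; };
            printf("%d\n", add(5));  /* prints 15 */
            return 0;
        }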

    b) Apple added APIs that can take these "blocks" and schedule them to be executed on different threads/cores. There are many different libraries available on other platforms that have done the same for years.

    The only slight advantage Apple has is that these APIs are now provided by the OS, while in other cases they had to be used via add-on libraries. But they have no real technical advantage. Other libraries are also globally (hence system-wide) aware of other threads as well as all hardware resources, so there really is no technical advantage in Apple's case.

    So I would hardly write a whole blog about how Apple is blazing a new trail when in fact all they are doing is catching up.
    Qbt
    • Thank you

      Was going to post the same thing, but you've already done so with a good deal more technical insight than I would have been able to give.

      Here's a question for Apple though. They've been selling 8 core desktops for years now. Are they now saying that, until SL, those have been a waste of money for folk that shelled out for those 8 cores? Cause that's what I'm hearing.

      Being sarcastic of course, as it's been known for some time that 8 cores is ridiculous overkill for anything outside a server environment. Even 4 cores have only over the last year proven themselves to be useful as Windows apps take advantage of the aforementioned tech that Apple is claiming to be trailblazing with. It's nice to see Apple admit that they've been overcharging consumers - even if indirectly.

      "The views expressed here are mine and do not reflect the official opinion of my employer or the organization through which the Internet was accessed."
      gnesterenko
      • Multiple cores

        Well, you have *always* been able to do this kind of stuff in most programming languages for quite some time.

        They basically just grabbed something normally done by a compiler and handed it to the OS.

        If you have an application compiled to use multiple threads - then it'll take advantage of multiple cores, with or without the help of the OS.

        "as its been known for sometime that 8 cores is ridiculus overkill for anything outside a server environment. "

        Yes and no. I can think of a few markets where multiple cores helps outside the server environment:

        -Gaming. Many games are moving to support multiple cores. Generally used for the AI and physics.

        -Video and audio processing.

        -Raytracing.

        -High resolution photo editing.

        -Interestingly enough, Google's Chrome browser supports multiple cores: Each tab is in its own process, and the OS can move processes into separate cores.

        But you're kinda right: These are, after all, specialized markets.

        And Chrome won't benefit too much because it's not a CPU intensive application to begin with. The separation into processes was more for stability than to equalize any CPU load. In that respect, you'll get the benefits of Chrome's threading even on a single core system.

        General purpose computing pretty much stopped benefiting from CPU increases a long time ago. I'd venture to say that once we hit the 1 GHz mark on single core systems we hit the point where even apps considered "bloated" like Office didn't benefit from further speed increases.

        Dual cores added some benefit by making entire-OS slowdowns much harder. No longer can a poorly designed app kill a computer by pegging CPU usage at 100%.

        But much beyond that - yeah, for general usage, it's overkill. Most people would be fine with any of the Core 2 processors.

        The biggest market for non-server use is likely the gamers - games are constantly pushing the boundaries of power, and for the best experience in the top games, you'll need all of the power you can get.
        CobraA1
        • core != process

          Be careful equating support for multiple processes with support for multiple cores. Support for multiple processes is common in Unix and NT-based Windows. Efficient support for multiple cores is much harder to do and is still an area of ongoing work - it still bothers me that Outlook can cause my PC to become unresponsive or that one tab in Safari can block the others on OS X 10.5, for example. These are examples of processes that can 'lock' processes running on other cores. To see what needs to be done to resolve this, look at the evolution of FreeBSD from about version 4 to version 7 - it is a lot of work.
          shis-ka-bob
    • SL is a small but impressive update

      "The only slight advantage Apple has is that these APIs are now
      provided by the OS, while in other cases they had to be used via add-
      on libraries. "

      Only a slight advantage;-)

      "But they have no real technical advantage."

      No advantage running these lightweight threads out of userspace;-)

      "So I would hardly write a whole blog about how Apple is blazing a
      new trail when in fact all they are doing is catching up."

      Catching up with whom? What other OS ships with integrated task
      parallelism, open GPU co-processing subsystem, and 64-bit
      userspace supporting 32 and 64-bit kernels?
      Richard Flude
      • Really?

        [i]"The only slight advantage Apple has is that these APIs are now provided by the OS, while in other cases they had to be used via add-on libraries. "

        Only a slight advantage;-)[/i]

        Yes, the only slight advantage is that the APIs are part of the OS, instead of having to use an additional library (which is trivial). There is no technical advantage, though (and I am purely talking about GCD here, not any other SL feature).

        [i]No advantage running these lightweight threads out of userspace;-)[/i]

        Can you explain what is better with these threads than with any other Windows thread running in user mode, including the thread pools that have been available on Windows for years?

        [i]Catching up with whom? What other OS ships with integrated task parallelism[/i]

        Yea, as I said, that is the [b]only[/b] advantage Apple has, in that it added this functionality directly into the OS, which has been the same thing you could do on many other platforms for years with additional libraries. The rest of the world welcomes Apple into the 2000's!

        [i]open GPU co-processing subsystem[/i]

        Funny you should mention "open" while trying to sell Apple. LOL. GPU processing is nothing new either, so once again Apple saw what others are doing, did their own "open" implementation, and added a shiny name to it to confuse people into thinking they are on the cutting edge.

        [i]and 64-bit userspace supporting 32 and 64-bit kernels[/i]

        Yea, you have to pick one or the other. They can't just [b]get it right[/b] and go with 64-bit, now can they? Why not? They have a tiny amount of hardware to get the 64-bit drivers working for, most of which they control themselves. Yet they can't even get that sorted out. Instead people like you pat them on the back because now you have to deal with this situation where end-users need to pick one or the other. I must hand it to Apple - they can turn any failure into something that their followers will think is a huge innovation. You Apple fanboys are a laugh-a-minute...
        Qbt
        • Yes

          "Can you explain what is better with these threads than with any other
          Windows thread running in user mode, including the tread-pools that
          have been available on Windows for years?"

          Their simplicity of use, their lower overhead and (non-library) the
          system wide pool.

          "...which has been the same thing you could do on many other
          platforms for years with additional libraries. The rest of the world
          welcomes Apple into the 2000's!"

          Wow, these libraries weren't available for OS X?

          "Yea, you have to pick one or the other. They can't just get it right and
          go with 64-bit, now can they? Why not? "

          Why experience the 64-bit Vista nightmare at introduction when all
          can be avoided. Apple gets the benefit of 64-bit applications with the
          compatibility of 32-bit kernel during transition. Just what do you
          believe is the significant benefit offered with a 64-bit kernel?

          "They have a tiny amount of hardware to get the 64-bit drivers
          working for, most of which they control themselves. "

          They control the hardware in their systems, but don't write most of
          these drivers as they're sourced from other parties.

          "Instead people like you pat them on the back because now you have
          to deal with this situation where end-users need to pick one or the
          other."

          Users don't have to do anything.

          "...Apple - they can turn any failure..."

          Now it's a failure to seamlessly migrate to 64-bit?

          "You Apple fanboys are a laugh-a-minute..."

          The MSCE never fails to impressive me with his ignorance of
          technologies and other platforms.
          Richard Flude
          • That's the best you can come up with?

            [i]"Can you explain what is better with these threads than with any other Windows thread running in user mode, including the tread-pools that have been available on Windows for years?"[/i]

            [i]Their simplicity of use[/i]

            Any proof of this? No, Apple's PR page doesn't count as proof.

            [i]their lower overhead[/i]

            Any proof of this? No, Apple's PR page doesn't count as proof.

            [i]and (non-library) the system wide pool.[/i]

            Are you really under the impression that thread pools under Windows all just do their own thing, and that the OS scheduler doesn't schedule them properly?

            [i]Wow, these libraries weren't available for OS X?[/i]

            Can you explain to me how you were previously going to use closures in Objective-C in order to schedule these "blocks" of nested code? In C# and many other languages you have been able to do this for years already. Objective-C has only now been updated to support this. You have to accept the fact that Apple is playing catch up with Objective-C. I realize it is hard to accept as an Apple fanboy but you'll need to try.

            As far as your 64-bit comments go, didn't Apple tell us years ago already that they had the first "64-bit workstation"? (Not) And today they [b]still[/b] don't have a true 64-bit OS. Well, unless you happen to remember to hold down the "4" and "6" keys... What a joke.

            I would suggest that you don't get all of your info from Apple's PR pages and sites like www.MacsAreTehBestest.com
            Qbt
          • And yet you come up with nothing

            "Any proof of this?"

            Perhaps you'd like to support any of you claims;-)

            "simplicity": #include <dispatch/dispatch.h>, system wide facility
            "overhead": GCD queues overhead is 256 bytes
            "system wide pool": GDC has a global pool, windows application
            pools. Leaving out thread pools, this is vastly superior to thread
            programming (which is also available). Pervasive tech in OS X, going
            forward even greater pool efficiencies.

            "Can you explain to me how you were previously going to use closures
            in Objective-C..."

            Mac OS X programming isn't restricted to Objective-C. The same
            languages and libraries you're promoting are available on Mac - with
            application pools;-)

            "You have to accept the fact that Apple is playing catch up with
            Objective-C."

            Right, however this was never your claim.

            "And today they still don't have a true 64-bit OS."

            Actually they do, including a 64-bit kernel for restricted systems.

            Again what is it you believe Apple users are missing? 32-bit kernel
            users, unlike windows, benefit from the greater performance and
            memory in 64-bit userland whilst maintaining compatibility.

            Apple introduced 64-bit computing for the PPC. Continued the
            transition with Tiger, then Leopard. All has been seamless to the user.
            They continue again with SL. All the benefits, none of the setbacks
            with the windows migration. And the MSCE continues to bleat.

            Fact remains OS X is the only OS to include the technologies promoted
            in the article. Only the deluded could believe this is playing catch-up.
            Richard Flude
    • closures != anonymous methods

      Anonymous methods are syntactic sugar. Closures are more, see http://martinfowler.com/bliki/Closure.html
      shis-ka-bob
      • Not so fast...

        The article you link to got it wrong. The C# "anonymous methods" do indeed capture the surrounding variables, and behave very much like closures. Not just that, but unlike Java, you can also modify those captured variables. I think I would know since I have been doing this very same thing for a few years now. It comes down to semantics, really. See this:

        http://www.theserverside.net/tt/articles/showarticle.tss?id=AnonymousMethods

        We can go into minute details here (which ultimately don't add up to much), but the point is that Apple is clearly playing catch up here and not inventing something revolutionary (at least as far as GCD is concerned).

        For instance, look at Richard Flude's post above: He is desperately flailing around trying to keep the illusion that Apple did something innovative. This is typical of people that know a little bit of technical details but don't fully understand the concepts in depth. A perfect candidate to fall for Apple's "shiny new name" attempt at fooling people.
        Qbt
  • It's always awesome

    to see the Anti-Apple Trolls at work to get the first posts to an Apple article. :)
    Michael Alan Goff
    • Yep, first at many things

      First at x64. First at multi-core support. And now first at ZDNet posts!

      "The views expressed here are mine and do not reflect the official opinion of my employer or the organization through which the Internet was accessed."
      gnesterenko
    • Yes, because the typical response

      ...from the Apple fanboys would be to tell us how Apple is on the cutting edge because of GCD, when in fact all Apple did was to finally add support for concepts that have been actively used on other platforms and in other languages for years.

      Oh, and to give it a shiny name to confuse bloggers into thinking Apple is on the leading edge.

      But Apple is actually correct, this is revolutionary. On the Mac. When you look at the big picture, they are simply catching up.
      Qbt
      • Just like the iPod caught up to MP3 players already

        out there, eh? Yeah that's the ticket. Oh by the way, Richard Flude seems to have blank slapped you all over this post, what's up with that?

        Pagan jim
        James Quinn
  • Some thoughts

    "Now the company has begun talking more publicly about it and other deeper projects to take advantage of graphics chips and Intel's 64-bit processors."

    That's good to know.

    Now all they need to do is to convince more game devs to develop for their platform.

    "If building mature parallel programming tools is a 10-chapter book, the industry is only at chapter two right now"

    More like chapter 1. We still need to get a lot of the theory fleshed out and look at how we write languages. There's a whole world of message based programming languages playing around with ways of doing stuff in asynchronous ways without requiring locks, and it's still barely a blip on the radar.
    CobraA1
    • And all this explains why...

      Parallelism is something Microsoft and others are trying to work out, and to convince CS schools to work out for them.

      As you say, barely a blip on the radar in terms of major advance. Mayhaps it all might be a spur to large scale research in colleges and a new rise of CS as a major of import.
      zkiwi
  • Finally an informed article re SL on ZDNet

    Even if it had to come from another publication.

    SL includes some nice new technologies, a noticeable speed boost for Intel Macs, and costs very little.

    The bleating about this release on ZDNet has been nothing short of amazing.
    Richard Flude