AMD aims to simplify GPGPU programming with HSA and hUMA

Summary: Combining the power of the CPU and the GPU in GPGPU solutions offers plenty of performance, but they're cumbersome for developers because the CPU and GPU use separate memory pools. AMD plans to eliminate this burden with yet another acronym: hUMA.

Chip maker AMD is looking to aid those wishing to make use of GPGPU by throwing more acronyms at them, in the form of HSA and hUMA.

While CPUs are great at processing single-threaded code full of branches, they're not so good at parallel operations. GPUs are the opposite: great at crunching through parallel operations, but weak at single-threaded code. This has given rise to general-purpose computing on the GPU (GPGPU), which combines the two in an attempt to offer the best of both worlds.

While GPGPU offers the best of both worlds in terms of processing power, it does have a drawback – it's not easy to leverage. Specifically, addressing memory is cumbersome, because while the CPU and GPU may share the same physical memory chips, they have their own pools of memory. This means that data has to be copied back and forth between the CPU and GPU, which is not only wasteful in terms of processing power, but also a massive code overhead.
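
To make that overhead concrete, here's a minimal sketch of the copy-in, compute, copy-out dance that separate memory pools force on developers. It uses NVIDIA's CUDA runtime purely as a familiar stand-in for the pattern; AMD's own OpenCL stack involves the same allocate, copy, launch and copy-back steps.

    // The pre-hUMA model: the GPU works on its own copy of the data,
    // so every round trip costs two explicit transfers.
    #include <cuda_runtime.h>
    #include <cstdio>
    #include <vector>

    __global__ void scale(float *data, int n, float factor) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= factor;
    }

    int main() {
        const int n = 1 << 20;
        std::vector<float> host(n, 1.0f);              // data lives in CPU memory

        float *dev = nullptr;
        cudaMalloc((void **)&dev, n * sizeof(float));  // separate GPU-side allocation

        // Copy in, compute, copy out: the "back and forth" described above.
        cudaMemcpy(dev, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);
        scale<<<(n + 255) / 256, 256>>>(dev, n, 2.0f);
        cudaMemcpy(host.data(), dev, n * sizeof(float), cudaMemcpyDeviceToHost);

        cudaFree(dev);
        printf("host[0] = %f\n", host[0]);             // prints 2.000000
        return 0;
    }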

AMD wants to eliminate this burden with a new system design called the Heterogeneous System Architecture (HSA), and at the core of that is "heterogeneous Uniform Memory Access," also known as hUMA (as if we didn't have enough acronyms already).

(Image: AMD)

Boiled down to its simplest terms, hUMA allows the CPU and the GPU to share the same chunk of memory, which in turn simplifies the programming model and makes it far easier for developers to leverage GPGPU.
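
For contrast, here's roughly what the same job looks like when one allocation serves both processors. The sketch below borrows CUDA's managed memory as an analogy for the idea only; it is not AMD's HSA tooling. The point is simply that a single pointer is valid on both the CPU and the GPU, with no explicit copies.

    // The hUMA-style model: one allocation visible to both processors.
    #include <cuda_runtime.h>
    #include <cstdio>

    __global__ void scale(float *data, int n, float factor) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= factor;
    }

    int main() {
        const int n = 1 << 20;
        float *data = nullptr;
        cudaMallocManaged(&data, n * sizeof(float));    // one shared allocation

        for (int i = 0; i < n; ++i) data[i] = 1.0f;     // CPU writes it directly

        scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f); // GPU uses the same pointer
        cudaDeviceSynchronize();                        // wait for the GPU to finish

        printf("data[0] = %f\n", data[0]);              // CPU reads the result: 2.000000
        cudaFree(data);
        return 0;
    }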

The first AMD hardware to support hUMA will be the upcoming Kaveri APUs. These will feature Steamroller processing cores, and are expected to make an appearance during the second half of 2013.

(Image: AMD)

Even better for developers is the news that hUMA will be supported by mainstream programming languages such as C++ and Java.
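
In practical terms, that language support should mean an ordinary pointer carries the same meaning on both processors, so pointer-based data structures can be handed to the GPU as-is rather than flattened and copied. The following is again only a rough sketch, using CUDA's managed memory as a stand-in for the concept rather than AMD's actual toolchain: the CPU links a small list with plain pointers and the GPU simply follows them.

    // With a single virtual address space, pointers written by the CPU
    // remain valid when dereferenced by the GPU.
    #include <cuda_runtime.h>
    #include <cstdio>

    struct Node {
        float value;
        Node *next;      // an ordinary pointer, shared as-is
    };

    __global__ void sum_list(const Node *head, float *out) {
        float total = 0.0f;
        for (const Node *n = head; n != nullptr; n = n->next)
            total += n->value;               // GPU follows CPU-written pointers
        *out = total;
    }

    int main() {
        Node  *nodes  = nullptr;
        float *result = nullptr;
        cudaMallocManaged(&nodes, 3 * sizeof(Node));
        cudaMallocManaged(&result, sizeof(float));

        // CPU builds the list using plain pointers; no marshalling step.
        for (int i = 0; i < 3; ++i) {
            nodes[i].value = float(i + 1);
            nodes[i].next  = (i < 2) ? &nodes[i + 1] : nullptr;
        }

        sum_list<<<1, 1>>>(&nodes[0], result);   // pass the pointer straight through
        cudaDeviceSynchronize();

        printf("sum = %f\n", *result);           // prints 6.000000
        cudaFree(nodes);
        cudaFree(result);
        return 0;
    }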

hUMA is expected to find its way into a broad range of hardware, from servers to game consoles. In fact, in an interview, PlayStation 4 lead architect Mark Cerny suggested that Sony's upcoming console may make use of this technology.

(Image: AMD)

Talkback

7 comments
  • The memory for GPUs is different . . .

    Generally speaking, graphics cards use dedicated memory for two reasons:

    1) To take the load off of main memory. When using a GPU intensive app, it may use a lot of memory to store all of the vertices and textures. With a very realistic game, that means there's little room in memory for much else.

    2) Memory on graphics cards is generally GDDR, not standard DDR. GDDR is generally more expensive, but faster, as it needs to keep up with realtime graphics involving millions of polygons and large textures.


    Normally for graphics, having separate memory isn't a problem: The next step for graphics after the GPU is done with them is generally the monitor, not the CPU. The CPU generally cares very little about what's done after it hands the information to the GPU.

    However, there are some cases where things may be handed back to the CPU: Physics and scientific computing come to mind. In these cases, shared memory may make some sense.

    To me, "hUMA" sounds like a variation of shared memory - not necessarily anything new. Intel has been doing this for years. And to be honest, Intel has not impressed anybody with its results - it constantly underperforms even the slowest dedicated video cards.


    "In fact, an interview with the PlayStation 4 lead architect Mark Cerny suggests that Sony's upcoming console may make use of this technology."

    What's interesting is that the PlayStation 4 is not using standard DDR, as Intel generally does with its dedicated graphics. The PlayStation 4 is using GDDR, and such a thing has never, to my knowledge, been done with a CPU.

    This IMO makes the PlayStation 4's performance a big wildcard. Nobody knows how this will turn out. It's never been tried before.

    It could go one of two ways:

    1) The GDDR boost makes it a screaming fast machine, and makes it a great gaming console.

    2) The fact that it's a type of shared memory inherits the drawbacks of shared memory, and it underperforms.

    Which way it'll go, nobody can tell you: This is literally something that hasn't been tried before. I hope it succeeds, but only time will tell.
    CobraA1
    • Re: nobody can tell you

      You might not, but some of us do have a clue. The thing will scream.

      Unifying CPU and GPU memory access and cache coherency is a good idea. It's already done in ARM's big.LITTLE architecture and planned for Intel CPUs. It's the current idea for how to improve performance.

      The next step will be an external GPU interconnect different from PCI-Express, for example something like HyperTransport. Not trivial, but doable. Technologies like hUMA will only make this task more challenging.
      danbi
      • I'd need to see a benchmark.

        "You might not, but some of us do have a clue."

        Doubtful.

        I'd need to see a benchmark.

        "Unifying CPU and GPU memory access and cache coherency is good idea."

        It's a good idea for many applications. But being that memory is now shared, things are competing for it. Intel does something similar with integrated graphics (uses system memory for graphics), and nobody has been impressed with the performance.

        Hopefully the hUMA way is better than the integrated graphics way.

        "The next step will be external GPU interconnect different that PCI-Express, for example something like HyperTransport."

        Dunno what people have against PCI-E. v4.0 will actually be faster than HyperTransport.

        I've never had issues with bus speed being a bottleneck when gaming. I imagine for scientific research it could be an issue, but not gaming.
        CobraA1
    • Re: The memory for GPUs is different . . .

      That's less of an issue. What's different here is the unified virtual address space between CPU and GPU, so code running on both can exchange pointer values and have them point to the same thing. That should simplify programming enormously.

      With that complexity problem out of the way, THEN we can deal with the performance issues of working with different kinds of memory.
      ldo17
  • New vector for attack

    Does this mean that we (but not me, I don't program drivers!) will need to make sure our video drivers are extra-hardened, as the GPU will have direct access to the same memory as the CPU?
    x I'm tc
    • Re: New vector for attack

      Don't worry, nothing new will happen. For ages, any peripheral that can do DMA has had unrestricted access to the "same" memory as the CPU. Did you trust those drivers? A driver already runs with unrestricted access anyway, especially in Windows, so even a keyboard driver can do bad things.
      danbi
  • Education To Help Those...

    ...with no sense of hUMA.
    ldo17