
Harnessing a multicore future

Last week's PDC was about more than just Microsoft's new cloud initiative (Windows Azure) and the next version of Windows (Windows 7). It also concentrated on what Microsoft is doing to help programmers tackle the big shift to multi-core computers. Harnessing the power of multi-threaded programming will be critical to achieving higher software performance in the future.

Written by John Carroll, Contributor Nov. 3, 2008 at 2:32 a.m. PT

There were a lot of things discussed at the PDC, and if I had the time, I probably could have written several articles each day on various subjects. If I had done that, however, I wouldn't have had time to attend all the sessions which are such wonderful sources of information, hence cutting down on my ability to write articles about the PDC in the first place. This is why I think Microsoft needs to accelerate that cloning machine, so I can attend IT conferences, work on code related to my business, and spend a few months in Tahiti all at the same time.

So, this is a bit of the overflow from last week's PDC, and I may have a bit more later this week, depending on what happens in the world of Information Technology.

Microsoft has been pouring a lot of resources into figuring out ways to help developers work around Moore's "wall." That's a play on Moore's law, which postulated that the density of transistors on a processor would double every two years. This postulate proved correct, and the result was that CPU speeds climbed steadily as the number of transistors packed onto them doubled, which made life very easy for programmers.

From a programming standpoint, "single-threaded" programs are the easiest to write. In a single-threaded program, each operation in the code runs only after the previous one has finished. Most user interfaces are single-threaded (though frameworks such as WPF do a good job of offloading heavy-duty tasks to other processors, such as the specialized one on the graphics card), and many standard programming concepts assume single-threaded processing. Most language-level looping constructs (bits of code that cause other pieces to get repeated in a certain controlled fashion) assume single-threaded handling.

With Moore's law, programmers could write those "easy" programs, and count on the fact that ever-doubling transistor densities would make their programs run faster, for free. Some presenters at the PDC called this the hardware "free lunch," as programmers didn't have to do anything in particular to benefit from it.

Moore's law, however, has been running into the limits of quantum physics. Transistors can only get so small. In theory, the insurmountable limit to transistor size is a handful of atoms (which is quite a bit smaller than current transistors), but realistically, quantum effects place the limit at much larger scales. These effects make it very hard to shield transistors from electrical activity in other nearby components, and heat starts to become a real problem (as should be clear to anyone who has seen the monster heat dissipation sinks found in modern desktop computers).

There is some debate as to whether Moore's law is truly at an end. Something new might be discovered which could turbocharge Moore's law using exotic new materials, or new ways could be found that make better use of silicon. For now, however, it's pretty clear that the speed gains that came with Moore's law have slowed, and in response, chip manufacturers are adding more and more cores.

Unfortunately, that doesn't improve software speeds unless a program has been designed to take advantage of those multiple cores. Programs must be designed to enable separate activities to be run in parallel, and this creates a great deal of complexity. When you have multiple threads of control, you have to worry about things like synchronized access to data. You don't want one "thread" of a program modifying data while another is using that data to make calculations. Imagine an accounting spreadsheet that had one thread updating line items while another used those same values to create a total. The total might end up matching neither the sum of the values that existed at the start of the run, nor the sum of the values that exist once the data modification thread is complete.
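The spreadsheet scenario can be sketched in a few lines of C#. This is a minimal illustration rather than anything shown at the PDC: the Ledger and LedgerDemo names are hypothetical, and a plain lock is only one of several ways to guard the data. The writer thread moves money between line items (so the true total never changes) while the reader thread keeps totalling; because both go through the same lock, the reader should never see a half-finished update.

```csharp
using System;
using System.Linq;
using System.Threading;

// Hypothetical spreadsheet model; all names here are illustrative only.
public sealed class Ledger
{
    private readonly object _gate = new object();
    private readonly decimal[] _lineItems;

    public Ledger(decimal[] initial)
    {
        _lineItems = (decimal[])initial.Clone();
    }

    // Moves an amount between two line items; the grand total is unchanged.
    public void Move(int from, int to, decimal amount)
    {
        lock (_gate)
        {
            _lineItems[from] -= amount;
            _lineItems[to] += amount;
        }
    }

    // Sums under the same lock, so it never observes a half-finished Move.
    public decimal Total()
    {
        lock (_gate)
        {
            return _lineItems.Sum();
        }
    }
}

public static class LedgerDemo
{
    // Returns true if every total read during the run matched the expected 600.
    public static bool Run()
    {
        var ledger = new Ledger(new[] { 100m, 200m, 300m });
        bool consistent = true;

        var writer = new Thread(() =>
        {
            for (int i = 0; i < 100000; i++)
                ledger.Move(i % 3, (i + 1) % 3, 1m);
        });
        var reader = new Thread(() =>
        {
            for (int i = 0; i < 100000; i++)
                if (ledger.Total() != 600m) consistent = false;
        });

        writer.Start();
        reader.Start();
        writer.Join();
        reader.Join();
        return consistent;
    }
}
```

Remove the two lock statements and the reader can catch the array between the subtraction and the addition, which is exactly the kind of bug that only shows up once in a while and never on the developer's machine.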

This extra complexity leads to more bugs, and higher costs. Worse, it is hard for even experienced programmers to make bug-free multithreaded programs. Humans may be massively multithreaded creatures (our ability to read text as fast as we do is a result of massively parallel processing taking place in our brains), but we aren't particularly aware of it, and modeling it in code is highly error-prone.

Since the reality is that we will have multiple cores in new computers (hundreds of cores may be typical in standard desktop computers before too long), the need for constructs that make it easier for developers to tackle these difficult computing problems is acute. As the PDC made clear, Microsoft has been spending a lot of resources on creating these constructs.

Separating the "how" something is done from the "what" it does (a simple way of describing the difference between imperative and declarative coding constructs) is a critical part of making a program work well in multicore settings. Constructs to assist in this process already exist. Windows Workflow Foundation (WF) is a framework for assembling discrete tasks without specifying too carefully exactly how they will be processed. The important role of WF in the .NET product landscape became more apparent over the course of the PDC, as it is the API that trains developers to think in terms of discrete packets of code logic that can be assembled and, potentially, processed in parallel (subject to "rules" included as part of the "flow"). This is not a post about Windows Workflow, though a book on the subject which I bought at the PDC is proving rather interesting.

Language Integrated Query (LINQ), Microsoft's new data access framework that has direct support in C#, also takes a declarative approach: you state what data you want without specifying precisely how the request will be handled. Because of this declarative approach, making a LINQ request a PLINQ request (the "P" stands for "Parallel") is as easy as adding .AsParallel() to the data source in a LINQ query, as follows:

var q = from c in customers.AsParallel()
        where c.Country == "US"
        orderby c.Name
        select c;

Note that this would be an ordinary LINQ query if the AsParallel() suffix had not been added to customers. The declarative nature of the base LINQ syntax, in other words, has made it very easy to "parallelize."
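For anyone who wants to try the query, here is a minimal self-contained sketch. The Customer class and the PlinqDemo helper are assumptions made up for illustration (the original snippet does not say what customers contains), and the code uses AsParallel() as it appears in System.Linq in the shipped .NET 4.0 framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record type standing in for the "customers" source.
public sealed class Customer
{
    public string Name { get; set; }
    public string Country { get; set; }
}

public static class PlinqDemo
{
    // Returns the names of US customers, sorted by name.
    // Deleting ".AsParallel()" turns this back into an ordinary LINQ query
    // with the same result.
    public static List<string> UsCustomerNames(IEnumerable<Customer> customers)
    {
        var q = from c in customers.AsParallel()
                where c.Country == "US"
                orderby c.Name
                select c.Name;
        return q.ToList();
    }
}
```

The where clause may be evaluated on several cores at once, but the orderby still determines the order of the final list, so callers should not be able to tell the difference apart from the running time.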

Newer, more explicit constructs will arrive as part of .NET 4.0. New classes, such as those in the new "Task" library, make it easier to parcel your code into discrete packets, leaving the question of "how" it is parcelled off across cores to the framework. In the Task library's case, it tries to keep the number of threads running your code in line with the number of cores on the system. This cuts down on context switching, a process by which the data related to your thread is saved away from the registers of a CPU core so that another thread can use that core. Context switching is necessary when there are more threads than cores, but it can make your new multithreaded program run slower than its single-threaded predecessor if you aren't careful. ThreadPool (a class I use almost to the complete exclusion of Thread) has been improved in ways I am still trying to grok (the local queue concept is interesting, though I'm not sure I understand WHY it improves performance; then again, that's why I use frameworks), and new constructs like Parallel.For, Parallel.ForEach and Parallel.Invoke have been created to turn looping constructs into blocks of code that can be processed in parallel.
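As a rough sketch of what these constructs look like in code, the example below sums squares three ways: with an ordinary loop, with Parallel.For, and with two tasks. It assumes the API as it appears in System.Threading.Tasks in the shipped .NET 4.0 framework, which may differ in detail from the preview bits shown at the PDC; ParallelDemo and its method names are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelDemo
{
    // Ordinary single-threaded loop, for comparison.
    public static long SumOfSquaresSequential(int n)
    {
        return SumRange(0, n);
    }

    // The same loop with Parallel.For. Each worker keeps a private subtotal
    // and adds it to the shared total once, at the end of its share of work.
    public static long SumOfSquaresParallel(int n)
    {
        long total = 0;
        Parallel.For(0, n,
            () => 0L,                                          // starting subtotal per worker
            (i, loopState, subtotal) => subtotal + (long)i * i, // loop body
            subtotal => Interlocked.Add(ref total, subtotal));  // thread-safe merge
        return total;
    }

    // Two independent packets of work expressed as tasks; the framework's
    // scheduler, not this code, decides which threads and cores run them.
    public static long SumOfSquaresWithTasks(int n)
    {
        Task<long> first = Task.Factory.StartNew(() => SumRange(0, n / 2));
        Task<long> second = Task.Factory.StartNew(() => SumRange(n / 2, n));
        return first.Result + second.Result; // Result blocks until the task is done
    }

    // Sums i * i for i in [from, to).
    private static long SumRange(int from, int to)
    {
        long total = 0;
        for (int i = from; i < to; i++) total += (long)i * i;
        return total;
    }
}
```

Note that none of the three methods creates a thread directly. The code only describes which pieces of work are independent, which is the same "what, not how" split that makes LINQ easy to parallelize.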

If you are a .NET developer, and in particular, if you are a server-side .NET developer (where large numbers of cores are far more common today), understanding these constructs is likely to become essential. The site Microsoft has dedicated to "Parallel Computing" can be found at http://msdn.com/concurrency. Besides getting up to speed on Windows Workflow Foundation (an area of .NET 3.0+ that I have not spent much time learning), I plan to spend some time exploring the site. I'll write about anything that springs to mind.
