For the last couple of days I've been working behind the scenes with fellow bloggers Adrian Kingsley-Hughes and Ed Bott, and with Microsoft, to get to the bottom of the Vista gigabit throttling effect. This throttling occurs when someone launches Windows Media Player and tries to copy files at gigabit speeds. Ed had already traced the problem to MMCSS (Multimedia Class Scheduler Service), a mechanism designed to protect audio and video playback from stuttering due to CPU starvation when processor-intensive tasks like gigabit network traffic or anti-virus scanning kick in.
Microsoft responded last Friday that this throttling effect was in fact "by design", prompting Adrian to post this blog, but the response from ZDNet readers and the Slashdot community was not kind to Microsoft. This issue has already fanned the flames of Vista DRM conspiracy theories like the ones peddled by Peter Gutmann. The initial Microsoft response further fueled those theories, along with the claim that the inclusion of DRM made it necessary to implement the performance throttling in the first place. The responses in the forums to this "design" were unusually brutal, even for Microsoft.
UPDATE 8/29/2007 - My assessment of Microsoft's initial response is based on my perception that it was inadequate but I may have misrepresented their initial response. Microsoft did say in their initial response that they were thinking about how they were going to address the problem. Their exact words were "Of course, we are already thinking about how we can address this problem, but we are not at a point where we can discuss when that will be available or what form it will take." I personally thought anything short of explicitly calling it a bug and promising a fix was inadequate and still do, but I should have noted that they were looking at a solution.
Yesterday morning, Microsoft Fellow Mark Russinovich came out with a detailed technical explanation of the throttling effect and candidly acknowledged that this was in fact a bug. Russinovich explained how the hard-coding of network performance rate limiters was short-sighted and didn't make any allowance for the fact that CPUs are now much faster and have multiple cores. In Russinovich's own words, "the networking team is actively working with the MMCSS team on a fix that allows for not so dramatically penalizing network traffic, while still delivering a glitch-resistant experience".
Note: This is the kind of honest response that Vista customers and potential customers want to hear, and it wouldn't have been greeted in the forums with jokes that Vista is "broken by design". People are fairly forgiving if a company just owns up to a problem and promises to fix it, and this is a lesson that every company should learn. The problem really affected a minority of people a minority of the time, but consumers want a product that works all the time in all situations, no matter how unlikely they are to be affected. A simple and honest acknowledgement of the problem and a promise to fix it in a reasonable amount of time goes a long way.
Russinovich explained that network performance was hard-coded to cap at 10,000 packets per second whenever any MMCSS-enabled application started up (apparently even if no audio or video was being played) and demanded CPU priority. The hard-coded rate limit essentially caps network performance at around 15 MB/sec (megabytes per second), because 1500 bytes per packet times 10,000 packets per second equals 15 million bytes per second. 15 MB/sec works out to 120 Mbps (megabits per second), while most 10/100 networks top out around 90 Mbps and most broadband connections are capped at 1.5 Mbps. Even most so-called gigabit NAS (Network Attached Storage) devices have a performance cap of around 120 Mbps, so very few people will even notice the MMCSS-induced throttling in the first place.
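The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function name `throughput_cap` is made up for illustration, and it assumes every packet is a full 1500 bytes, uses decimal units, and ignores header overhead.

```python
def throughput_cap(packets_per_sec, bytes_per_packet):
    """Return (MB/sec, Mbps) implied by a packet-rate cap, in decimal units.

    Assumes every packet is the same size and ignores protocol overhead.
    """
    bytes_per_sec = packets_per_sec * bytes_per_packet
    mb_per_sec = bytes_per_sec / 1_000_000      # megabytes per second
    mbps = bytes_per_sec * 8 / 1_000_000        # megabits per second
    return mb_per_sec, mbps


# The 10,000 packets/sec cap with standard 1500-byte packets
mb, mbit = throughput_cap(10_000, 1500)
print(f"{mb:.0f} MB/sec = {mbit:.0f} Mbps")
```

With the numbers from Russinovich's explanation, this prints 15 MB/sec and 120 Mbps, which is why the cap goes unnoticed on anything slower than gigabit.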
Gigabit throttling in effect
The problem of gigabit packet rate throttling becomes noticeable when people start transferring files between gigabit-capable computers connected by gigabit Ethernet switches, both of which are very cheap these days. Most modern motherboards come with gigabit Ethernet adapters built in, and I bought a 5-port jumbo-frame capable 3COM gigabit switch for $36 online. In this use case, users can expect un-throttled speeds of 500 Mbps going from one PC's hard drive to another PC's hard drive using Windows SMB file sharing over the gigabit network.
Note: The gigabit file transfer would go even higher if typical 7200 RPM desktop hard drives could go faster than 500 Mbps.
Once the 10K packets per second MMCSS throttling is in effect, the user sees performance drop from around 60 MB/sec to about 15 MB/sec which you can see below.
MMCSS throttling in effect while Windows Media Player plays music:
No MMCSS throttling in effect:
As you can see, the CPU utilization isn't really that bad in either case on my dual-core Intel E6400 2.13 GHz processor. Microsoft's Vista team set the hard cap of 10K packets per second with a slower single-core processor in mind, when they should have made it adjust dynamically based on the processor. My processor could easily have been throttled to 50K packets per second and achieved the same desired effect, because my CPU is probably five times faster than the computer Microsoft was designing for. Furthermore, the amount of throttling should adjust dynamically depending on whether the machine is idle, playing music, playing a DVD, or playing a high-definition HD DVD or Blu-ray disc. Putting in an aggressive hard-coded limit based on the worst-case scenario is just wrong.
Jumbo frames to the rescue
However, I've found a fairly reasonable workaround using jumbo frames, which are generally a good idea if you're using gigabit Ethernet since they offer better performance and lower CPU consumption. With properly updated gigabit network adapter drivers on each gigabit-enabled PC and a jumbo-frame capable gigabit Ethernet switch, we can make the 10K packets per second limit less of an issue.
Note: The cheap 3COM 5-port gigabit switch supports 4K (4096 byte) jumbo frames while some gigabit switches may support up to 9K (9216 byte) jumbo frames. This limit wasn't a problem for me since a lot of gigabit network adapters have 4K limits and it's the lowest common denominator that sets the limit. Also note that an Ethernet frame is a layer-3 packet encapsulated in layer-2 Ethernet headers.
By using 4K jumbo frames, 10K packets per second times 4K bytes per packet gives you approximately 40 MB/sec, which is much better than the old 15 MB/sec cap. As you can see below, this works like a charm.
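To see how the same 10K packets per second cap plays out at different frame sizes, here is a rough Python sketch. The helper name `capped_mb_per_sec` is hypothetical, and the calculation treats the whole frame as useful data and assumes every frame is full size, so real-world numbers will differ somewhat.

```python
PACKET_CAP = 10_000  # packets per second, the hard-coded limit described above


def capped_mb_per_sec(frame_bytes, packet_cap=PACKET_CAP):
    """Approximate throughput ceiling in MB/sec (decimal) for a given frame size."""
    return packet_cap * frame_bytes / 1_000_000


# Standard frames, the 4K jumbo frames used here, and the 9K some switches allow
for label, size in [("standard 1500", 1500), ("4K jumbo", 4096), ("9K jumbo", 9216)]:
    print(f"{label:>14}: {capped_mb_per_sec(size):.1f} MB/sec")
```

Under these assumptions the 4K case comes out at roughly 41 MB/sec, close to the throttled figure measured with jumbo frames, and a 9K-capable setup would push the ceiling to around 92 MB/sec.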
MMCSS throttling less of an issue with 4K jumbo frames:
No throttling with 4K jumbo frames:
Ironically, the throttled performance seemed to be less erratic even though it was capped at 41 MB/sec whereas the un-throttled performance was 67 MB/sec. This erratic behavior may be due to the Realtek Ethernet drivers which were only recently fixed from producing silent data corruption but I'm still waiting for confirmation on that.
Gigabit send seems to be affected too
I was told by Microsoft that only received traffic was packet rate throttled and that sends weren't, though sends may be susceptible to lowered prioritization. I noticed that jumbo frame sends weren't penalized by MMCSS, but send performance without jumbo frames was nearly halved.
MMCSS deactivated with no media playing:
MMCSS prioritization in effect with music playing from WMP11:
This data almost makes it seem like there is some send packet rate throttling as well, though not quite as severe as the receive-side packet rate limiting. I'm going to have to see if Microsoft has more on this, but the use of jumbo frames alleviates the send slowdown as well. [UPDATE 2:40PM - Microsoft explained that even though there is no explicit send rate limiting, the acknowledgements would be implicitly capped, which explains the send throttling.]
MMCSS prioritization doesn't affect jumbo frame sends: