Q&A: Windows Server 2003 kernel guru

Windows core technology guru Rob Short explains how hackers were involved in Windows Server 2003 development, and why not all NT4 applications will run on it
Written by Rupert Goodwins, Contributor
At the Microsoft Server 2003 launch in London this week, ZDNet interviewed Rob Short, the vice-president of Windows Core Technology. Responsible for the overall engineering and management of the Windows kernel, Short talked about what makes Server 2003 different from previous Windows products, where Unix and Linux still have the advantage, hackers, application compatibility, performance and security.
ZDNet UK: Is it fair to say Windows Server 2003 is just XP with the .NET storage extensions bolted on?
Short: No, that's not accurate. It's true that the core of Windows is the same; many parts of the system are very similar across the two products. But a year and a half, two years ago we were looking at the constant problems we were having with security and hacks. The level of maliciousness of the hacks was getting frightening. We stopped all other work. We got the architecture people to look at each part of the code and work out how people would attack it, and based on that we tried to reduce the surface area: the exposed part of the product, the ways the system was listening to the network. That was the top priority, especially as we were creating new things. We spent a lot of time understanding how IIS (Internet Information Services) was managed, and there are a lot fewer ways to do that now. Each of the new components has a well-defined threat model analysed by security experts. The older ones have a lot turned off by default so that administrators are aware of what's running in the system.
And then if we move down a level, at the same time that we were doing the architecture review we took eight or ten of our best coding people and sent them off to go and be hackers. One person I have working for me actually used to be a hacker -- he's British -- and we persuaded him there was a career to be had. We took a whole bunch of these people and made them hackers. We had them hack the system. We took the people who were responsible for each component and we did design reviews and code reviews. We created a whole book of common coding problems that lead to security errors, and we took every piece of code in the system and compared it against those rules. We created tools that run across the code and understand almost all the attacks. Microsoft Research built a tool that can find almost all the buffer overflow problems, and compilers added a bunch of checking. So we've done stuff right across everything.
At the very top level it's the same -- the administrator of the system controls the passwords, what accounts are available and so on. The more locks you put on something, the harder it is to use, the more inclined someone is to leave it unlocked. You have to watch the balance between keeping it very tightly locked down and -- will people use it? But we took every single person who worked on the product, development and management teams, and had them look at the security from top to bottom. We're still finding issues. But all of the newer code has got to be ten or a hundred times better.
How do you see the patch rate changing?
Right now the patch rate is still high. We're doing a number of things. We're looking at the patches. A lot of times we look at an attack and we look at all the rest of the code across the system to see if the attack applies elsewhere. We've built a patch mechanism in 2003 that will be shipped externally. We'll be able to patch probably two thirds of the components without shutting the system down. That's an area where the Unix guys are ahead of us, because of the way they do redirection -- they can patch a file and then change the symbolic link. That's an area where we've got a problem, and we'll fix it in the near future when possible.
How many applications will transfer over from NT4 or 2000?
We had a very high goal, but what happened to the goal was that we ran into security problems. We added a lot of changes in the system so that the applications couldn't interfere with each other or the operating system. I'm not sure what the exact number is for taking an NT4 application and running it -- it's in the high 60 percent range. It's not 90. The ones that people make themselves tend to be better than the larger, all-encompassing applications. We've tested literally thousands of applications. There's an enormous list you can look at to see what you might have to change in your particular application. Most of the problems we've seen have been security related. There are some issues with the IIS redesign, but most of the time, if the application is following the rules then it will run. But I must admit the rules haven't been well publicised.
You pushed some of the IIS into the kernel, didn't you?
We have what we call a listener, an HTTP handler that we pushed into the kernel. We were looking at how to improve performance. Requests come in and go all the way through the networking and back into user mode where they're handed off. There is a huge amount of the web traffic that you can respond to very quickly without having to involve user mode. So there's HTTP.SYS, a driver that runs in kernel mode and responds in ways that are very well understood, with some parsing and quite a bit of caching, and it handles sessions and it's a huge performance win. Personally, I'm against shoving things into the kernel. That was a very careful decision. We have a lot of parsing in there, and that opens you up to buffer overruns and attacks. The amount of scrutiny that code has got is just plain ugly. Anything that gets it confused gets shoved straight back up.
What's happened to the file system?
There are two things. We spent a lot of time on performance. We created the SMB file server specs, and we didn't have the fastest one around, which was embarrassing. So we took our performance team and said "your mission is to make ours twice as fast as this other one on the market." We've actually done that. So there's a huge performance increase. Most of those are the type of changes that separate the different file streams from each other deeper down in the system, so you get more parallelism and it works better on a parallel system. We've drastically improved the performance on Checkdisk. The transaction file system lets you make a transaction across a collection of file changes. We've added shadowing, so you can take a snapshot of something at a point in time and make a backup on the fly. We've done things to the IO subsystem, with tighter integration between a RAID subsystem and caching.
How about the registry?
We've added more caching to the registry access, and we pulled apart the locking, which is one of the areas of the system we spent a lot of time trying to improve so it'll run on very large systems. On Itanium 64-bit systems and so on, we see a lot of lock contention. So right through the kernel and the IO system we pulled a lot of the locks apart. And the registry, we pushed the locks up a little bit. So the locking is finer grained.
Why is there no command-line-only version?
We're looking longer term to see what can be done, looking at the layers and what's available at each layer and how we make it much closer to the thing the Linux guys have -- having only the pieces you want running. That's something Linux has that's ahead of us, but we're looking at it. We will have a command-line-only version, but whether it'll have all the features in is another matter. A lot of the tools depend on having the graphical interface. Printing, for example, requires all the graphics subsystems because we have the "what you see is what you get" model. You need to have the whole of the display stuff to render it. It's a very tangled subsystem.
Are we going back towards two product lines again, with 2003 and XP taking the place of NT and Windows 9X?
They're much more compatible; they're created from the same code base. It's the same application interface, except that the server is extended. The same drivers work in both. Looking from underneath or above, they're the same. That's what we're trying to do, and it certainly wasn't the case with NT and 9X. The embedded product will be built from the same code base. We're moving towards building what we like from a common set of components. It looks really good on PowerPoint! Reality is never quite as good.
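Sidebar: in his answer on patching, Short says Unix administrators "can patch a file and then change the symbolic link". A minimal sketch of that idea in Python for a Unix-like system follows. It is an illustration only, not a description of any vendor's patch mechanism, and the file names (component-1.0, component-1.1) and the functions switch_link and demo are hypothetical.

```python
import os
import tempfile


def switch_link(link_path: str, new_target: str) -> None:
    """Repoint link_path at new_target without a window where the link is missing.

    A temporary link is created first, then renamed over the old link.
    On POSIX systems the rename replaces the old link in a single step.
    """
    tmp_link = link_path + ".tmp"  # hypothetical naming choice for the sketch
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(new_target, tmp_link)
    os.replace(tmp_link, link_path)


def demo() -> tuple:
    """Install an 'old' and a 'patched' file side by side and flip the link."""
    with tempfile.TemporaryDirectory() as root:
        old_version = os.path.join(root, "component-1.0")
        new_version = os.path.join(root, "component-1.1")
        for path, text in ((old_version, "old code"), (new_version, "patched code")):
            with open(path, "w") as handle:
                handle.write(text)

        # Programs always open the stable name, never a versioned file.
        current = os.path.join(root, "component")
        os.symlink(old_version, current)
        with open(current) as handle:
            before = handle.read()

        switch_link(current, new_version)
        with open(current) as handle:
            after = handle.read()
        return before, after


if __name__ == "__main__":
    print(demo())
```

A process that already has the old file open keeps using it until it reopens the stable name, which is why this approach can avoid stopping the whole system but does not by itself update running code.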