More Windows 7 corruption and repair woes
Recently, we've been seeing a noticeable increase in Windows 7 PCs launching "repair mode" automatically at boot time. In this latest wave, Windows 7 launches the automatic repair, the repair fails when the user allows it to run, and the PC ends up in a boot loop: repair mode comes up every time and Windows can no longer boot into the regular shell.
For these latest occurrences, I grabbed a Windows 7 DVD and tried to get Windows to repair itself, hoping to avoid a lot of user downtime and perhaps identify the cause of this latest increase. I do know that it is not caused by Windows updates, as WSUS is used and no updates have been released to the organization recently. I do have to admit the repair tools are very easy to follow, all completely wizard-driven; however, I have seldom seen them actually work or fix the problem at hand.
In the latest cases, I tried booting the PC in Safe Mode as well, hoping that I could get into Windows to look around at the event logs. However, even then, the PC reboots itself just before the Windows shell would come up.
So, for now we've been faced with re-imaging PCs to get users back up and running quickly. This is much faster than re-installing Windows manually or trying to diagnose the list of error codes provided by Microsoft's tools. If Microsoft's own repair tools could simply fix the issue, it would be a huge time-saver, as the system could be fixed within minutes rather than the hours it takes to re-image a system and archive and restore user data. The problem is, I've rarely seen the Microsoft repair tools actually fix anything.
Before Windows apologists comment on this post, I will provide the steps taken to attempt the offline repair on one of the sample PCs with this latest issue, to demonstrate a wholehearted attempt to use Microsoft's repair tools.
1. Boot the Windows 7 DVD.
2. Select English language, and click Next.
3. Click "Repair your computer".
4. The repair tool searches for Windows installations, and shows the one present on the system.
5. Select the option "Use recovery tools that can help fix problems starting Windows" and click Next.
6. Next there are 5 options displayed: Startup Repair, System Restore, System Image Recovery, Windows Memory Diagnostic, Command Prompt. Since the issue at hand is a boot issue, I selected "Startup Repair".
7. The recovery tool says that it is searching for problems, then after about a minute, says "Startup Repair could not detect a problem".
But there's an option for "View diagnostic and repair details", so I click that. I figure that there HAS to be something useful in the log. Well, the log is detailed. It shows the primary disk in the system and its partition, along with the tests performed, which cover the disk, OS, registry, and other volume information. Then at the bottom it says "Root cause found: Unspecified changes to system configuration might have caused the problem. Repair action: System files integrity check and repair. Result: Failed. Error code = 0x490. Time taken = 784575 ms." So, why can't the repair tools actually repair something? Can't system files be copied from the source DVD and restored without doing a full re-installation? Apparently not.
So, back to the main menu we go, and this time I try "System Restore". I click Next and select the restore point from 6 days ago. A prompt comes up, "Once started, the system restore cannot be interrupted. Do you want to continue?", and I select Yes. Immediately the main menu comes back up and a message pops up saying "Restoring files". After a couple of minutes of churning, the error comes up: "System Restore did not complete successfully. Your computer's system files and settings were not changed. Details: An unspecified error occurred during System Restore. (0x800700b7)". I click "Close" and the system reboots itself. A minor annoyance, as I just have to boot back into the recovery tools. I tried two other restore points and got the same error.
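The two error codes reported by the repair tools can at least be decoded, even if the tools themselves do not explain them. The following is a minimal Python sketch, not something Microsoft's tools provide: it splits a value into the standard HRESULT fields (failure bit, facility, 16-bit code) and looks the code up in a small hand-written table. The helper names and the two-entry table are illustrative only, and treating the bare 0x490 from the Startup Repair log as a plain Win32 error number is an assumption.

```python
FACILITY_WIN32 = 7  # HRESULT facility used for wrapped Win32 errors

# Hand-picked subset of Win32 error numbers (from winerror.h), only for this post.
WIN32_NAMES = {
    183: "ERROR_ALREADY_EXISTS",
    1168: "ERROR_NOT_FOUND",
}


def decode_hresult(value):
    """Split a 32-bit HRESULT into its failure bit, facility, and code."""
    value &= 0xFFFFFFFF
    return {
        "failure": bool(value >> 31),       # bit 31: severity
        "facility": (value >> 16) & 0x7FF,  # bits 16-26: facility
        "code": value & 0xFFFF,             # bits 0-15: error code
    }


def win32_name(value):
    """Return the Win32 error name for a value, or None if it is not in the table.

    Assumption: a value with no failure bit and facility 0 (such as 0x490)
    is treated as a bare Win32 error number rather than an HRESULT.
    """
    fields = decode_hresult(value)
    wrapped = fields["failure"] and fields["facility"] == FACILITY_WIN32
    bare = not fields["failure"] and fields["facility"] == 0
    if wrapped or bare:
        return WIN32_NAMES.get(fields["code"])
    return None


for code in (0x490, 0x800700B7):
    print(hex(code), decode_hresult(code)["code"], win32_name(code))
```

Under those assumptions, 0x490 maps to Win32 error 1168 (ERROR_NOT_FOUND) and 0x800700b7 maps to error 183 (ERROR_ALREADY_EXISTS). Neither name says which element was missing or which file already existed, so decoding narrows the search more than it explains the failure.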
So, after about an hour of messing around with a single PC, we are in the same boat with nothing accomplished. No cause has been found for this latest round of corruption and boot problems. As I mentioned above, this time is better spent re-imaging a new system. Those who say that Windows 7 has superior stability or is miles above XP are only kidding themselves. Even after a year of deploying Windows 7 over XP, there is still little return on investment to be seen. In fact, I think there is still a deficit because of the resources used not only for troubleshooting ongoing Windows 7 problems, but also for testing and re-purchasing incompatible applications.
In comparison, GNU/Linux has similar repair software that is used by booting the repair DVD for the distribution. It scans the installed packages and repairs them as necessary. Since the Linux boot sequence is far less complex than that of Windows, the kernel can at least boot and get the user to a command prompt (in case X11 can't start), allowing for further troubleshooting with the log files. Fortunately, I haven't needed to run a repair like this for GNU/Linux in a long, long time. Repairs just aren't needed the way they are in Windows. But I'm guessing the latest GNU/Linux repair DVDs are very efficient at fixing issues, if any do come up. GNU/Linux keeps almost everything at the filesystem level, and uses a very stable filesystem on top of that (commonly EXT3/EXT4), which overall provides the top-notch stability that GNU/Linux is already well known for.
Talkback
jw
Thanks for the feedback and humorous story. And I have to admit, I'm not surprised that even a brand new PC with Windows doesn't run properly. As I mentioned before in another post, we've seen PCs that were installed with a fresh copy of Windows 7 start to exhibit this boot loop problem about 2 weeks after they were put into use. 2 weeks! That's hardly any time at all. So far there have been several cases of this. In some cases, the problems went away like magic and Windows would boot up like normal, and other times it would be caught in an infinite reboot loop, just as you saw. And to top it off, I just recently posted an article about GNU/Linux running for 10 years on the original installation of the OS. Wow, what a difference.
New Comp SOS~ hELP PLEASE~
I would have to agree with you that the registry may be at fault here. My point in the article was to again demonstrate that Microsoft's own tools are inadequate for fixing problems. You back this up by pointing out a 3rd party solution that picks up where Microsoft leaves off, which is typical for Microsoft software.
In fact, I find the documentation for ERUNT somewhat humorous in itself, when reading statements in it such as:
"And since the registry is quite sensitive to
corruption, it is very advisable to backup its according files from
time to time."
"In Windows NT and 2000, the registry is never backed up
automatically, and in XP it is backed up only as part of the bloated
and resource hogging System Restore program which cannot even be used
for a "restore" should a corrupted registry prevent Windows from
booting. It has also become impossible to copy the necessary files,
now called "hives" and usually named DEFAULT, SAM, SECURITY, SOFTWARE,
SYSTEM in the SYSTEM32\CONFIG folder, to another location because they
are all in use by the OS."
"Note: The "Export registry" function in Regedit is USELESS (!) for
making a complete backup of the registry. Neither does it export the
whole registry (for example, no information from the "SECURITY" hive
is saved), nor can the exported file be used later to replace the
current registry with the old one. Instead, if you re-import the file,
it is merged with the current registry without deleting anything that
has been added since the export, leaving you with an absolute mess of
old and new entries."
These statements only back up my previous points about Windows lacking tools, or providing tools that don't work. Microsoft could write a reliable registry backup utility that works in System Restore, but clearly they have not. And, I've pointed out on many occasions that the registry design itself is a single point of failure on top of the filesystem, which is another single point of failure, therefore doubling the chance of failure on a Windows system. If you are right about a corrupt registry causing System Restore to malfunction, this provides yet another example of that. In GNU/Linux, there is no registry. Parameters and configs are usually kept in flat text files, allowing for a much simpler (and more reliable) design. GNU/Linux keeps almost everything at the filesystem level, and uses a very stable filesystem at that (typically EXT3/EXT4), resulting in the very stable operating system it is known to be.
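The difference the ERUNT documentation draws between re-importing an export and doing a true restore can be shown with a toy model. This is only a sketch using plain Python dictionaries; it does not touch the real registry or any Regedit API, and the key names are hypothetical.

```python
def import_merge(current, exported):
    """Model re-importing an export: exported values overwrite, nothing is deleted."""
    merged = dict(current)
    merged.update(exported)
    return merged


def restore_replace(current, backup):
    """Model a true restore: the backup replaces the current state entirely."""
    return dict(backup)


# Hypothetical key names, used only for illustration.
exported = {r"Software\AppA\Version": "1.0"}
current = {
    r"Software\AppA\Version": "2.0",    # changed since the export
    r"Software\AppB\Installed": "yes",  # added since the export
}

print(import_merge(current, exported))
print(restore_replace(current, exported))
```

After the merge, the AppA value is rolled back but the AppB entry added after the export is still there, which is the mix of old and new entries the documentation warns about. The replace version ends up with exactly what was backed up.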
As with most Windows problems, it doesn't happen on every PC, but it happens repeatedly on a small percentage of them. You mention 1400 PCs with Windows 7; do you oversee those PCs? I am curious as to how you are basing your conclusion of "1400 PCs without any problems". I find that a little far-fetched. Simply doing an Internet search for the problems stated above turns up countless results, where you can see others having the exact same issues.
"The cause is a CRC error that is easily fixed that way in about 15-20 minutes."
With this example, you've basically pointed out that the NTFS filesystem is unstable and needs repair. Why does this happen repeatedly, even on PCs where Windows was installed just weeks prior?
So the computer went back to the manufacturer under warranty for a new drive. Of course I spent time recovering his masses of data with the said Live Linux disk; and subsequently applying all the updates, installing software all over again and reloading his data.
Not Windows 7's fault, but very trying just the same, just a very few weeks after having done all that on his new purchase.
The last time I tried it, after about 20 minutes it said it couldn't continue. That was it, no hint of what the problem was. Useless!
Thanks for the feedback. I too figured at first that the issue might be due to a bad hard disk; however, in every case of the issue here, simply re-imaging the PC fixed it and the PC was fine after that.
I totally agree! Same exact issue here. So, now I know there are at least two of us who conclude that System Restore fails to work as advertised :) Thanks for the comment.
That is an excellent idea to separate the user data from the system binaries and data. I suppose it would be even more ideal to repoint the entire profile directory to a folder on the other data partition, if that is possible in Windows. For a home solution, this makes perfect sense, though it would require some work. In fact, Microsoft recommends "Folder Redirection" for pointing various subfolders in the profile to alternative locations. But what about other data, like Firefox profiles and other items that aren't contained in the limited list of folders allowed for redirection? Oops, I guess Microsoft forgot about all of that. But, in theory, you could use a "junction point", which is basically a symlink, to point the c:\users folder to an alternate location. For a corporate environment, we don't even mess with re-installing Windows; we simply re-image a new PC and start from scratch, which handles two problems in one (it gives the user a clean installation and newer hardware at the same time) and requires less of the technician's time in troubleshooting.
This is what I really love about GNU/Linux, because separating the user profile data from the operating system is super easy, compared to Windows. Simply have the /home filesystem mount to another partition, or symlink the /home directory to a folder on another partition, and you are done! When a new account is created on the PC, its folder is automatically created in the alternative location. And restoring profiles to the /home directory is a snap as well, unlike Windows 7 which requires a bunch of fiddling around in the registry. I usually separate user profile data from the operating system in GNU/Linux only for the purpose of easy backup (the "/" filesystem can be easily backed up or captured using Clonezilla, and the "/home" filesystem can be backed up with rsync separately).
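As a rough illustration of the rsync half of that backup approach, here is a minimal Python sketch that builds the rsync command for mirroring /home to a backup location. The function names and the destination path are hypothetical; `-a` (archive mode) and `--delete` are standard rsync options, and whether `--delete` is wanted depends on how the backup is meant to behave.

```python
import os
import subprocess


def build_home_backup_command(destination, source="/home/"):
    """Build an rsync command that mirrors the contents of source into destination.

    -a preserves permissions, ownership, timestamps, and symlinks;
    --delete removes files from the destination that no longer exist in source.
    """
    if not source.endswith("/"):
        source += "/"  # trailing slash: copy the directory's contents, not the directory itself
    return ["rsync", "-a", "--delete", source, destination]


def backup_home(destination, source="/home/"):
    """Run the backup, noting when source is not its own mount point."""
    mount_point = source.rstrip("/") or "/"
    if not os.path.ismount(mount_point):
        print(f"note: {mount_point} is not a separate filesystem")
    subprocess.run(build_home_backup_command(destination, source), check=True)


# Hypothetical destination; adjust to the actual backup location.
print(build_home_backup_command("/mnt/backup/home/"))
```

Only the command is printed here; calling `backup_home("/mnt/backup/home/")` would actually run rsync, so it is left out of the example on purpose.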
Thanks for the feedback.
So you are saying that we should spend even more money, and pay highly trained experts to come in and put out fires all of the time? A number of us have MCP training, including myself, so it's not a training issue anyway. It's an issue with software failing to work out of the box as it was designed to do. I've even been in MCP classes where the software fails to work in the labs too. The instructor usually admits to the failure, and everybody just looks past it and continues on. I don't see how anybody would be able to prevent the issues mentioned above. We've followed the book in setting up our systems, yet a considerable number of them fail to work properly out of the box. While the numbers may skew things somewhat (a higher number of Windows workstations vs. GNU/Linux workstations), I consistently see one operating system failing while the other does not. In fact, I've seen this over many years, and in many different environments. I've posted Linux problems as well, the same as I do with Windows, and provided solutions because there were some. Once in a while I will come across something in Windows that is better than or as good as GNU/Linux, and I post it. But overall, I see the Windows problems a LOT more, and many times they result in threads that have the same issues listed, with dead ends. And yes, I do know that Microsoft offers support at an expensive hourly rate. We have GNU/Linux desktops and we do not see the issues that we do with Windows. I've also deployed GNU/Linux desktops for friends and family, and guess what, no more calls from them for me to remote in and fix things every few weeks, like they did with Windows! So even with different samplings, the results are still enough to draw conclusions. The trends are there; it's up to us to see and notice them.