If you have a mysterious problem with a Linux box, try bashing your system with sys_basher

Summary: Finding mysterious crashing problems is often a big challenge. Read on to learn about a free tool you can use to help find those hidden troubles.

I don't really consider myself a Linux guru. I haven't spent as much time working with Linux as I'd like, given all my other responsibilities. It's a weird feeling, because I once pretty much knew all there was to know about UNIX (I wrote kernel code, was a product manager for a very well-known implementation, etc.), but that was a very long time ago.

These days, my use of Linux is simply as a system manager for my own servers, and I'm as much an explorer as anything else. That's why, when I talk with you about Linux, I'll be sharing with you what I consider a "discovery," but what a true Linux guru would probably think is common knowledge. Even so, there are a lot of explorers out there and I hope these tips can help.

I've got one such tip today. I co-locate my servers at Prominic.NET. Before I tell you the rest of the story, I should disclose that these guys are old friends of mine, and I'm a very happy fan of their service.

In any case, I'd been having no end of problems with one of the Linux machines I co-lo there. It's a CentOS 5.6 machine and, about once a week, it would crash hard. It had been doing this for months, and we had no idea what was causing it.

In desperation, I asked for help. Eric McCartney over at Prominic took some pity on me and started looking at the system. He did all the usual swaps and tests, but they turned up no obvious issues.

Finally, Eric loaded up a neat little program called sys_basher, which exercises all the elements of the system...hard. It puts the machine under a strong load and if something can't handle the load, it'll die. Quickly.
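If you're curious what "exercising the system hard" looks like in practice, here's a tiny Python sketch of the general idea. To be clear, this is not sys_basher and it is not how sys_basher is implemented; the function names (memory_pattern_pass, cpu_burn) are my own, made up for illustration. A real stress tool would load every core and hammer the disks as well, which this sketch does not attempt.

```python
import hashlib
import time


def memory_pattern_pass(size_bytes, pattern):
    """Fill a buffer with one byte value, read it back, and count mismatches.

    Hypothetical helper for illustration only. On healthy hardware the
    result should be 0.
    """
    buf = bytearray([pattern]) * size_bytes
    # count() tallies every byte that still holds the expected value.
    return size_bytes - buf.count(bytes([pattern]))


def cpu_burn(seconds):
    """Hash in a tight loop until the deadline; return the number of rounds.

    Hypothetical helper for illustration only. It keeps one core busy.
    """
    deadline = time.monotonic() + seconds
    digest = b"seed"
    rounds = 0
    while time.monotonic() < deadline:
        digest = hashlib.sha256(digest).digest()
        rounds += 1
    return rounds


if __name__ == "__main__":
    # Alternating bit patterns are a common choice for simple memory checks.
    for pattern in (0x55, 0xAA):
        bad = memory_pattern_pass(16 * 1024 * 1024, pattern)
        print(f"pattern {pattern:#04x}: {bad} mismatched bytes")
    print(f"cpu rounds in 1s: {cpu_burn(1.0)}")
```

The point of a loop like this is not the numbers it prints. If a component is marginal, sustained load tends to make the machine fall over much sooner than normal use would, and that is the behavior we were counting on.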

As it turned out, the problem was a hard drive. The weird thing is, we hadn't allocated the hard drive in the Linux environment yet. It was simply plugged in, awaiting configuration. But, when that hard drive was plugged in, sys_basher would take the system down almost instantly. When the hard drive was unplugged, the system would run rock solid.

Hard drives, like all system components, are not perfect. But it would have taken far longer to diagnose what was up without sys_basher bashing the system.

Since we removed the drive, we've run fourteen days of testing with sys_basher and the machine has been solid. I think we've found the problem.

So, if you have a mysterious problem with a Linux box, try using sys_basher and see if it'll help you track down the trouble.


About

In addition to hosting the ZDNet Government and ZDNet DIY-IT blogs, CBS Interactive's Distinguished Lecturer David Gewirtz is an author, U.S. policy advisor and computer scientist. He is featured in The History Channel special The President's Book of Secrets, is one of America's foremost cyber-security experts, and is a top expert on savi...
