If you have a mysterious problem with a Linux box, try bashing your system with sys_basher
Summary: Finding mysterious crashing problems is often a big challenge. Read on to learn about a free tool you can use to help find those hidden troubles.
I don't really consider myself a Linux guru. I haven't spent as much time working with Linux as I'd like, given all my other responsibilities. It's a weird feeling, because I once pretty much knew all there was to know about UNIX (I wrote kernel code, was a product manager for a very well-known implementation, etc.), but that was a very long time ago.
These days, my use of Linux is simply as a system manager for my own servers, and I'm as much an explorer as anything else. That's why, when I talk with you about Linux, I'll be sharing with you what I consider a "discovery," but what a true Linux guru would probably think is common knowledge. Even so, there are a lot of explorers out there and I hope these tips can help.
I've got one such tip today. I co-locate my servers at Prominic.NET. Before I tell you the rest of the story, I should disclose that these guys are old friends of mine, and I'm a very happy fan of their service.
In any case, I've been having no end of problems with one of the Linux machines I co-lo there. It's a CentOS 5.6 machine, and about once a week it would crash hard. It had been doing this for months, and we had no idea what was causing it.
In desperation, I asked for help. Eric McCartney over at Prominic took some pity on me and started looking at the system. Even after he did all the usual swaps and tests, there were no obvious issues.
Finally, Eric loaded up a neat little program called sys_basher, which exercises all the elements of the system...hard. It puts the machine under a strong load and if something can't handle the load, it'll die. Quickly.
As it turned out, the problem was a hard drive. The weird thing is, we hadn't allocated the hard drive in the Linux environment yet. It was simply plugged in, awaiting configuration. But, when that hard drive was plugged in, sys_basher would take the system down almost instantly. When the hard drive was unplugged, the system would run rock solid.
Hard drives, like all system components, are not perfect. But it would have taken far longer to diagnose what was up without sys_basher bashing the system.
Since we removed the drive, we've run fourteen days of testing with sys_basher and the machine has been solid. I think we've found the problem.
So, if you have a mysterious problem with a Linux box, try using sys_basher and see if it'll help you track down the trouble.
Talkback
RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher
PS. Kudos DG.
Hardware Troubleshooting 101
$ man stress
So quick to judge....
Seeing that the drive was unallocated and not set up, your step 2 probably may not run on the drive. It didn't on my unallocated drives until they were set up. However, I use a live cd to run tests on all my systems, such as Knoppix.
RE: quick to judge
"... probably may not ..."
lol, are you sure?
"unallocated" is meaningless. Unpartitioned? Unformatted? Unmounted? A drive is a drive is a drive. If it's plugged and powered on, you can and should run `badblocks` on it.
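For readers unsure what a safe invocation looks like, here is a minimal sketch (assuming Python 3.8+ for `shlex.join`) that builds a read-only `badblocks` command line. The helper name `badblocks_command` and the `/dev/sdX` placeholder are hypothetical; `-s` (show progress), `-v` (verbose) and `-b` (block size) are documented `badblocks` options, and leaving out `-w` and `-n` keeps the scan in its default read-only mode.

```python
import shlex


def badblocks_command(device, block_size=None):
    """Build the argv for a read-only badblocks scan of `device`.

    Hypothetical helper for illustration: it only assembles the command,
    it does not run it. Running badblocks normally requires root.
    """
    if not device.startswith("/dev/"):
        raise ValueError("expected a device node such as /dev/sdX")
    # -s shows progress, -v is verbose. Without -w or -n the scan is read-only.
    argv = ["badblocks", "-s", "-v"]
    if block_size is not None:
        # -b sets the block size in bytes.
        argv += ["-b", str(block_size)]
    argv.append(device)
    return argv


# /dev/sdX is a placeholder, not a real device name.
print(shlex.join(badblocks_command("/dev/sdX", block_size=4096)))
```

The sketch deliberately stops at printing the command, so the person at the keyboard can check the device name before running anything against a disk.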
RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher
A smug linux user? Never! He specifically stated that he was not a guru and this tip is for the novice.
"... I consider a "discovery," but what a true Linux guru would probably think is common knowledge."
Get over yourself. Go pick a fight with an apple fanboy or something.
RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher
No "smug" - a fountain of Linux knowledge willing and able to help.
"He specifically stated that he was not a guru and this tip is for the novice."

This is the worst direction you could possibly send a novice in. `stress` is a well-known standard tool for stress testing a system and it's already available in all of the standard package managers. *NOTE HERE* that stress testing is *NOT* the same thing as diagnosing hardware failures. The author needed the latter anyway.

You don't send a novice to go download some wild, one-off package that no one's heard of that might need to be custom compiled from source on a server that's already failing.

Package Managers are your friends.
RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher
Who cares if you're not impressed? It worked for him, so it's good enough. Not all of us Linux users are omnipotent.
RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher
The badblocks manpage states:
"Important note: If the output of badblocks is going to be fed to the e2fsck or mke2fs programs, it is important that the block size is properly specified, since the block numbers which are generated are very dependent on the block size in use by the filesystem. For this reason, it is strongly recommended that users not run badblocks directly, but rather use the -c option of the e2fsck and mke2fs programs."
In regards to sys_basher, most newbies would not know how to compile a C program, much less set up their Linux environment to enable such a task. And it's probably not a good idea to suggest to newbies that they leave the confines of their vetted distro repository and install alien software like sys_basher. But if they know C programming, even if they are Linux newbies, they can vet the code themselves, provided they understand hex shellcode, a classic method for inserting malware into code.
Most problems, hardware or software, usually reveal themselves in the logs residing in /var/log, and the software to read them is readily available on all Linux systems.
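As a rough illustration of the log-reading approach, here is a minimal sketch that filters log lines for strings that often accompany hardware trouble. The function names and the pattern list are assumptions made for this example, not a complete or authoritative set of kernel messages. Adjust them to what your own logs actually contain.

```python
import re

# Illustrative patterns only (an assumption for this sketch): a few strings
# that commonly show up in kernel logs when a disk, link, or memory misbehaves.
HARDWARE_HINTS = re.compile(
    r"I/O error|medium error|hard resetting link|machine check|\bEDAC\b",
    re.IGNORECASE,
)


def find_hardware_hints(lines):
    """Return the log lines that match any of the illustrative patterns."""
    return [line.rstrip("\n") for line in lines if HARDWARE_HINTS.search(line)]


def scan_log_file(path):
    """Scan one log file. Reading files under /var/log may require root."""
    with open(path, errors="replace") as log:
        return find_hardware_hints(log)
```

On a CentOS machine, kernel messages usually end up in /var/log/messages, so calling `scan_log_file("/var/log/messages")` would be a reasonable first try. An empty result does not prove the hardware is healthy. It only means none of these particular patterns appeared.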
RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher
"In regards to sys_basher, most newbies would not know how to compile a C program, much less set up their Linux environment to enable such a task. And, it's probably not a good idea to suggest to newbies that they leave the confines of their vetted distro repository and install alien software like sys_basher."
He did say that he is running CentOS. The application in question IS in the RPM package system for CentOS and Fedora. David didn't even run it himself, at least initially. "Eric McCartney over at Prominic" actually loaded the app.
I saw no reference to any "Newbie" being instructed to compile and run the app on their own. Most would first ask for assistance from someone they know or online.
RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher
SPAM!
RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher
DFTT
RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher
That was probably his (unintentional) point. This had nothing to do with Linux, yet the article and headline make it seem like it does.
RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher
Lulz. DFTT indeed.
Say that to the folks with ATI cards. BSD IS BEZT. AMG H8 LINAKS SO MUCH. ARG.
You're as bad as a hardcore anti-MS zealot. You think we don't have feelings? You honestly think you can mess with our heads and emotions? You think we don't care?
We don't care. Nothing to see here, move along.
Great article DG, and thank you for the comments. Great insights asmoore82.