If you have a mysterious problem with a Linux box, try bashing your system with sys_basher

Summary: Finding mysterious crashing problems is often a big challenge. Read on to learn about a free tool you can use to help find those hidden troubles.

I don't really consider myself a Linux guru. I haven't spent as much time working with Linux as I'd like, given all my other responsibilities. It's a weird feeling, because I once pretty much knew all there was to know about UNIX (I wrote kernel code, was a product manager for a very well-known implementation, etc.), but that was a very long time ago.

These days, my use of Linux is simply as a system manager for my own servers, and I'm as much an explorer as anything else. That's why, when I talk with you about Linux, I'll be sharing with you what I consider a "discovery," but what a true Linux guru would probably think is common knowledge. Even so, there are a lot of explorers out there and I hope these tips can help.

I've got one such tip today. I co-locate my servers at Prominic.NET. Before I tell you the rest of the story, I should disclose that these guys are old friends of mine, and I'm a very happy fan of their service.

In any case, I've been having no end of problems with one of the Linux machines I co-lo there. It's a CentOS 5.6 machine and about once a week, it'd crash hard. It's been doing this for months, and we had no idea what was causing it.

In desperation, I asked for help. Eric McCartney over at Prominic took some pity on me and started looking at the system. Even after he did all the usual swaps and tests, there were no obvious issues.

Finally, Eric loaded up a neat little program called sys_basher, which exercises all the elements of the system...hard. It puts the machine under a strong load and if something can't handle the load, it'll die. Quickly.
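
To make the "load it and see what breaks" idea concrete, here is a minimal Python sketch of a write-then-verify loop. This is not sys_basher's code, and it says nothing about how that tool works internally; the function name `bash_memory`, its parameters, and the choice of bit patterns are all assumptions made for illustration. It fills a buffer with known byte patterns, reads them back, and counts anything that doesn't match.

```python
import time

# Hypothetical helper, not part of sys_basher: write known byte patterns
# into a buffer, read them back, and count bytes that come back wrong.
PATTERNS = (0x00, 0xFF, 0x55, 0xAA)


def bash_memory(size_bytes=1024 * 1024, passes=2):
    """Return the number of mismatched bytes seen across all passes."""
    mismatches = 0
    buf = bytearray(size_bytes)
    for _ in range(passes):
        for pattern in PATTERNS:
            # Write phase: fill the whole buffer with one byte pattern.
            buf[:] = bytes([pattern]) * size_bytes
            # Verify phase: every byte should still hold that pattern.
            mismatches += size_bytes - buf.count(pattern)
    return mismatches


if __name__ == "__main__":
    start = time.monotonic()
    errors = bash_memory()
    elapsed = time.monotonic() - start
    print(f"{errors} mismatched bytes in {elapsed:.2f}s")
```

On healthy hardware the count should stay at zero. A toy like this only touches a small slice of memory from a single process, so it's no substitute for a real stress tool, but it shows the basic pattern: generate load, then check that the machine still gives the right answers.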

As it turned out, the problem was a hard drive. The weird thing is, we hadn't allocated the hard drive in the Linux environment yet. It was simply plugged in, awaiting configuration. But, when that hard drive was plugged in, sys_basher would take the system down almost instantly. When the hard drive was unplugged, the system would run rock solid.

Hard drives, like all system components, are not perfect. But it would have taken far longer to diagnose what was up without sys_basher bashing the system.

Since we removed the drive, we've run fourteen days of testing with sys_basher and the machine has been solid. I think we've found the problem.

So, if you have a mysterious problem with a Linux box, try using sys_basher and see if it'll help you track down the trouble.

Topics: Linux, Hardware, Open Source, Operating Systems, Software

About

David Gewirtz, Distinguished Lecturer at CBS Interactive, is an author, U.S. policy advisor, and computer scientist. He is featured in the History Channel special The President's Book of Secrets and is a member of the National Press Club.

Talkback

17 comments
  • RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher

    I said the day David G. starts writing about Linux, you know it has arrived.

    PS. Kudos DG.
    Return_of_the_jedi
  • Hardware Troubleshooting 101

    Step 1: memtest86+
    Step 2: `badblocks -sv`

    If And Only If (IFF) the above procedure would _not_ have revealed the same problem, then I'll be impressed with sys_basher. On 2nd thought, _still_ not impressed - you still had to _guess_ at which drive to leave out and re-run the tests. Not very "guru" at all.

    $ man stress
    asmoore82
    • So quick to judge....

      @asmoore82
      Seeing that the drive was unallocated and not set up, your step 2 probably may not run on the drive. It didn't on my unallocated drives until they were set up. However, I use a live cd to run tests on all my systems, such as Knoppix.
      linux for me
      • RE: quick to judge

        @linux for me
        "... probably may not ..."

        lol, are you sure?

        "unallocated" is meaningless. Unpartitioned? Unformatted? Unmounted? A drive is a drive is a drive. If it's plugged in and powered on, you can and should run `badblocks` on it.
        asmoore82
    • RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher

      @asmoore82
      A smug linux user? Never! He specifically stated that he was not a guru and this tip is for the novice.

      "... I consider a "discovery," but what a true Linux guru would probably think is common knowledge."

      Get over yourself. Go pick a fight with an apple fanboy or something.
      KenoshaSysAdmin
      • RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher

        @KenoshaSysAdmin
        No "smug" - a fountain of Linux knowledge willing and able to help.

        "He specifically stated that he was not a guru and this tip is for the novice."

        This is the worst direction you could possibly send a novice in. `stress` is a well known standard tool for stress testing a system and it's already available in all of the standard package managers. *NOTE HERE* that stress testing is *NOT* the same thing as diagnosing hardware failures. The author needed the latter anyway.

        You don't send a novice to go download some wild, one-off package that no one's heard of that might need to be custom compiled from source on a server that's already failing.

        Package Managers are your friends.
        asmoore82
    • RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher

      @asmoore82

      Who cares if you're not impressed? It worked for him, so it's good enough. Not all of us Linux users are omnipotent.
      KBot
    • RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher

      @asmoore82
      The badblocks manpage states:
      "Important note: If the output of badblocks is going to be fed to the e2fsck or mke2fs programs, it is important that the block size is properly specified, since the block numbers which are generated are very dependent on the block size in use by the filesystem. For this reason, it is strongly recommended that users not run badblocks directly, but rather use the -c option of the e2fsck and mke2fs programs."

      In regards to sys_basher, most newbies would not know how to compile a C program, much less set up their Linux environment to enable such a task. And, it's probably not a good idea to suggest to newbies that they leave the confines of their vetted distro repository and install alien software like sys_basher. But, if they know C programming, even if they are Linux newbies, they can vet the code themselves, provided they understand hex shellcode, a classic method for inserting malware into code.

      Most problems, hardware or software, usually reveal themselves in the logs residing in /var/log, and the software to read them is readily available on all Linux systems.
      GreyGeek77
      • RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher

        @GreyGeek77
        "In regards to sys_basher, most newbies would not know how to compile a C program, much less set up their Linux environment to enable such a task. And, it's probably not a good idea to suggest to newbies that they leave the confines of their vetted distro repository and install alien software like sys_basher."

        He did say that he is running CentOS. The application in question IS in the RPM package system for CentOS and Fedora. David didn't even run it himself, at least initially. "Eric McCartney over at Prominic" actually loaded the app.

        I saw no reference to any "Newbie" being instructed to compile and run the app on their own. Most would first ask for assistance from someone they know or online.
        rstanley@...
      • RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher

        @GreyGeek77
        You seem to be trying to make the point here that _no-one_ should _ever_ run badblocks...

        Epic logic fail. Read your own quote more carefully.

        _If_ the output ... is going to be fed ... it is strongly recommended that users _not_ run badblocks directly.

        We're not trying to feed any e2fsck or mke2fs here; just diagnosing hardware troubles. `badblocks` is your friend.
        asmoore82
  • RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher

    Impossible! I've been told that linux doesn't crash. Obviously you are not running linux. However, if your linux box did crash despite what the linux people want you to believe, then the best action would be to format the disk to wipe it clean, then install one of the BSDs on it if you need a real UNIX environment instead of some hacked up clone of it. It's free, stable, and secure.
    LoverockDavidson_
    • RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher

      @LoverockDavidson_

      SPAM!
      KBot
    • RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher

      @LoverockDavidson_
      "I've been told that linux doesn't crash."

      BIG +1.

      If you could actually be bothered to read the article you would know it was a Hardware Failure, *NOT* Linux's problem.

      But of course you didn't read the article, you've got so much more trolling to do and so little time - gotta move on quickly.
      asmoore82
      • DFTT

        No text needed
        grant@...
      • RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher

        @asmoore82

        That was probably his (unintentional) point. This had nothing to do with Linux, yet the article and headline make it seem like it does.
        aep528
    • RE: If you have a mysterious problem with a Linux box, try bashing your system with sys_basher

      @LoverockDavidson_

      Lulz. DFTT indeed.

      Say that to the folks with ATI cards. BSD IS BEZT. AMG H8 LINAKS SO MUCH. ARG.

      You're as bad as a hardcore anti-MS zealot. You think we don't have feelings? You honestly think you can mess with our heads and emotions? You think we don't care?

      We don't care. Nothing to see here, move along.

      Great article DG, and thank you for the comments. Great insights asmoore82.
      CommonOddity