"Geeks rule!"? Yes - but what that means depends.

Summary: A small homily about pain - and cards - and the spread of Linux

An odd experience this week: I wanted to get a bunch of people collecting roughly similar data (on donors) to produce samples for me that were both randomly selected and geospatially representative. Since that's oxymoronic, what I actually asked them to do was to stratify by district and then select randomly within each stratum.
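
For the curious, here is a minimal sketch of that stratify-then-sample step in Python. It is an illustration only, not what any of the sites actually ran: the record layout (dicts with a `district` key), the function name `stratified_sample`, and the per-district sample size of 10 are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, per_stratum, seed=None):
    """Pick up to `per_stratum` records at random from each stratum."""
    rng = random.Random(seed)  # seeded so a run can be replicated
    strata = defaultdict(list)
    for rec in records:
        strata[rec[key]].append(rec)  # step 1: stratify by the key
    sample = []
    for name in sorted(strata):
        group = strata[name]
        # Small strata contribute everything they have.
        k = min(per_stratum, len(group))
        sample.extend(rng.sample(group, k))  # step 2: sample within the stratum
    return sample


# Hypothetical donor rows; real data would come from the database.
donors = [
    {"donor_id": i, "district": d}
    for i, d in enumerate(["north"] * 50 + ["south"] * 5 + ["east"] * 20)
]
picked = stratified_sample(donors, key="district", per_stratum=10, seed=1)
```

Sampling inside each district is what keeps a small district like the hypothetical "south" from being swamped, which a single random draw over the whole table would not guarantee.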

The striking thing about the exercise was that an easy majority have moved the database for this to MySQL on Linux with holdouts on Solaris (mostly also MySQL), HP-UX, and various Microsoft configurations - but the unexpected thing was that none of the Unix people had any difficulty either understanding it or doing it; while the Wintel people equally unanimously wanted meetings, paperwork, "a better understanding of the requirements", and in something like three out of four cases additional monies from their bosses before they could see about getting it done.

I was contemplating the difference between the Wintel marketing image as the solution for do-it-yourselfers who want to avoid having to deal with systems managers and the reality of the inflexibility and burdens its protagonists impose on users when, just as I was explaining the joys of the formulation "I'm from Microsoft and I'm here to help", some black ice shifted my focus to practicing breakfalls on the sidewalk.

So as I'm lying there wondering where the phone went, one of the PC people who'd come out behind me looked down all concerned like to ask "did you fall?" Well, I'm a Jeff Foxworthy fan, so I dug out somebody's business card from a previous meeting and handed it to him.

I've no idea whose card it was or what he made of it - but the analogy between what really happens when businesses replace their Unix infrastructures with Microsoft people and that slickly invisible ice on the sidewalk? Yep: that I'll buy into.

Talkback

30 comments
  • That black ice can be a b1tch

    But I'm sure you landed safely on your (very hard) head! If you had written something positive about M$, I would have been concerned . . .
    Roger Ramjet
    • And white ice is not?

      nt
      Agnostic_OS
      • Black ice is harder to see...

        ...and therefore a bigger b1tch when driving or walking. I believe it's
        actually thin, clear ice, usually in small patches, which appears black on
        blacktop roads and hence is very hard to see. White ice is easy to see,
        and tends to be in larger patches, and therefore nowhere near as much
        of a b1tch.

        So, for all the PC police from warm climates out there, this is *not* a
        racially motivated term. Color is a physical attribute. Get over it.

        I obviously should not stop at the computer in the morning before
        having coffee.
        Erik Engbrecht
        • Sounds icist to me!

          nt
          Agnostic_OS
  • How do you randomly select data in SQL?

    A *nix admin who doesn't know can just dump to text and
    use any number of little scripting languages, but what's a
    Windows admin to do? He needs a DBA who knows archaic
    SQL or a software engineer who can write an application to
    do it.
    Erik Engbrecht
    • Here is one way (line) how a Windows admin could do it

      <i>A *nix admin who doesn't know can just dump
      to text and use any number of little scripting
      languages</i>

      Well, so could a Windows admin. Ever heard of
      PowerShell?

      <i>but what's a Windows admin to do? He needs a
      DBA who knows archaic SQL or a software
      engineer who can write an application to
      do it</i>

      Or he could just do

      <code>Invoke-Sqlcmd -ser dbserver -dat dbname
      "select * from sometable" | random -count 100 |
      ConvertTo-Csv</code>

      Yup. No need to dump to a file, just pipe it
      directly to the <i>random</i> cmdlet which will
      choose a number of random objects from the
      stream. Finally the records are exported in csv
      format. No archaic parsing.
      honeymonster
      • What???

        The command line in windows?

        It's amazing how you can say that after telling us for so long that *nix doesn't stand a chance against windoze because it forces you to use the command line.
        The Mentalist
        • Did I say that?

          or are you setting up a strawman?

          If you heard this argument I believe it must
          have been referring to the instances where you
          *can not* accomplish a task in Linux/Unix
          without resorting to the CLI and editing text
          configuration files (in all sorts of
          <i>strange</i> formats - from INI over the
          Apache pseudo-xml to peculiar colon-delimited
          passwords files).

          In Windows you <i>can</i> perform <b>all</b>
          administrative tasks through the GUI.
          <i>Automating</i> those tasks may require
          scripting. This is where PowerShell comes in.
          honeymonster
      • How's that different?

        Other than that the commands are a little different, that's exactly the
        type of thing I would expect a *nix admin to do.

        It's good to see that Windows 2008 Server is finally catching up with
        Unix from the seventies, and that MS is even being gracious enough to
        backport such innovations to earlier versions of Windows.

        What's a fancy new tool and associated techniques in Windows land is
        a decades old culture in Unix land. The task may be equally easy in
        Windows, but it will be years before average Joe Windows admin gets
        out of his GUI and uses them - if he ever does given the Windows
        culture.
        Erik Engbrecht
        • Not much

          Windows lacked a good admin-oriented scripting
          environment for many years. There was WSH with
          JScript and VBScript - but they lacked the useful
          piping feature of *sh shells.

          But PowerShell is more than just catching up.
          It is waaaay more consistent and leverages the
          object-oriented nature of Windows. Thanks to
          that PowerShell accomplishes the same as bash
          or zsh with far fewer and simpler commands.

          That's because PS has 1-upped the *nix shells:
          Object piping means that commands become
          simpler (to implement as well as to learn) and
          PS scripts become easier to write and read
          because there is no need for strange parsing
          and formatting to make ends meet (no need for
          awk, sed, cut etc).
          honeymonster
          • Sounds complicated

            Object orientation is often more of a problem than a solution.

            I take my statement back. MS hasn't managed to catch up to *nix from
            the seventies. Thanks for saving me the time of looking at PowerShell.
            After your first post I was seriously considering it.
            Erik Engbrecht
          • Speaking of *nix culture and Windows culture

            You just demonstrated why *nix will forever be
            stuck in a 1970 mindset where me-us-everyone
            security is sufficient, where the only
            securable objects are the ones in the file
            system, where every command overloads what -f
            means and where every application *must* define
            its own configuration file format.

            Dismissing a tool by homing in on a single
            phrase is sooooo mature. Way to go.
            honeymonster
          • If you haven't noticed...

            ...pretty much every "object" in Unix is represented as "a file." Which
            would make pretty much every object securable.

            My objection to OO doesn't have anything to do with Unix versus
            Windows. Plenty of Unix bigot programmers use languages like C++,
            Java, Python, Ruby, and even C# - all of which are object oriented to
            varying degrees and according to varying definitions.

            My objection is that languages that are OO have significantly worse
            compositional properties and weaker type systems than functional
            languages like the ML family and Haskell while lacking the flexibility
            and simplicity of text streams across a pipe. That's not to say I think
            OO is useless, especially in hybrid OO languages (Python, Ruby, Scala,
            etc) where you aren't completely forced into the object paradigm.

            For administrative scripting tasks a type system just gets in the way,
            and most of the time so do things like classes, even if they are
            dynamic. A prototype-based OO system like JavaScript's might
            actually work well, but I've never tried it for such tasks.
            Erik Engbrecht
          • I object! Take a look

            Take a look at the man pages of some of the
            most often used commands in *nix: ls and ps.

            Now look closely at how many of those options
            1) filter on some property not usually reported
            in the listing
            2) control output format so that both humans
            and next-in-pipeline commands may understand
            the output.
            3) select related information, e.g. threads.

            ps alone has some 60+ options!

            Now consider the ps command in PowerShell:

            <code>
            Get-Process
            -Name <string[]>
            -ComputerName <string[]>
            -FileVersionInfo [<SwitchParameter>]
            -Id <Int32[]>
            -InputObject <Process[]>
            -Module [<SwitchParameter>]
            </code>

            That's it.

            Now, is the *nix command more powerful? Nah -
            most of what it does is about output formatting
            <i>because some other command should be able to
            parse</i> it. Some options control whether a
            header should be printed. Other options control
            what information goes into the output. Many
            options control how that information is
            displayed, e.g. dates, sizes, string
            (delimited?) etc.

            In PowerShell, the ps command outputs
            <i>objects</i>. Another command can readily
            consume them <i>without needing to sed, cut or
            awk</i> anything. See, the object-orientation
            lifts the formatting and output selection out
            of the commands. They become simpler to
            implement and simpler to understand. They do
            "one thing only" - unlike the *nix beasts which
            must find the information and <i>then</i>
            format in text.
            honeymonster
          • So answer a question for me...

            I'm writing a script and I want to do what in an OO language would be
            effectively introducing a new class. For example, in the text processing
            way, I might have a script that takes tabular data, removes some
            columns, and adds some new ones in. Doing this is trivial using Unix
            style scripts because it's just text. How do I do it in PowerShell?
            Erik Engbrecht
          • Fair question.

            <i>I'm writing a script and I want to do what in
            an OO language would be effectively introducing
            a new class. For example, in the text
            processing way, I might have a script that
            takes tabular data, removes some columns, and
            adds some new ones in. Doing this is trivial
            using Unix style scripts because it's just
            text. How do I do it in PowerShell?</i>

            First off you have to realize that you seldom
            deal with raw text in PowerShell. Why? Because
            the built-in cmdlets all return streams of
            <i>objects</i> - not text. Hence I don't need
            to (re)parse the output from ps or ls - I can
            just work with the properties, methods and even
            events directly from PS:

            <code>ps|where{$_.WorkingSet -gt 100MB}</code>

            <i>ps</i> returns a stream of Process objects.
            Process happens to have a property called
            WorkingSet which indicates how much memory it
            uses. The <i>where</i> cmdlet filters using a
            script block - <i>$_</i> represents the
            iterator.

            So the above command tells me which processes
            use more than 100MB. Note that the properties
            are typed. 100MB is actually a number.

            The above command still returns a stream of
            Process objects. When they "fall off" the end
            of the pipeline they are being rendered
            according to a default format, unless I decide
            otherwise. Key point is that the objects still
            retain all their properties and methods. Unlike
            the *nix ps command where the only information
            forwarded is the text that you see.

            I could do

            <code>ps|where{$_.WorkingSet -gt
            100MB}|fl</code>

            which would list all processes in <i>list</i>
            format. This will list each property on a new
            line and an extra blank line between each
            process.

            <code>ps|where{$_.WorkingSet -gt 100MB}|ft
            Name,WorkingSet,Threads</code>

            which would list 3 columns, the "threads"
            column with multiple values (thread IDs).

            I could do

            <code>ps|where{$_.WorkingSet -gt
            100MB}|foreach{$_.WaitForExit()}</code>

            Which would wait for all of the "big" processes
            to terminate before the script continued.

            Keep in mind that "text" is just special
            objects. In PS, text is streams of strings. If
            I <i>really</i> wanted to process text PS has
            ample tools for that: I can import/export
            delimited files with and without headers, I can
            split strings on delimiters into arrays, join
            them again or drop down to regular expressions.

            I'm not trying to dodge the question, but you
            have to be more specific if I am to demonstrate
            the PowerShell equivalences to awk, sed, cut
            etc.
            honeymonster
      • awwwww..

        Powershell?! That's so cute. I just want to grab your cheeks and squeeze em.

        A for effort!
        civikminded
        • Agreed

          Wintel people singing the praises of powershell while dissing Unix for the CLI are both tragic and funny in a sad kind of way.

          Maybe MS should try a new slogan: "Making the past licensable again" or "Hide the origin" or "Facing Forward, Marching Backwards" ("Hypocrites is us"?)

          And, of course, in reality, the line quoted far above is a CLI demo, not something that would work. One of the fun consequences of OO in this case is that the job explodes in complexity once you start scripting the sql command (multiple tables) and structuring the output line. And note that you can't adopt either of the obvious solutions without complicating the command beyond redemption, because you have to stratify and choose randomly within each stratum, not choose randomly from the output after stratify and select. I.e. you'll be doing a foreach one way or the other.
          murph_z
          • Murph, please show me

            How to do this with your favorite Unix CLI? If you read the post I was
            answering.

            Erik was claiming this about a Windows admin: <i>"He needs a DBA who knows
            archaic SQL or a software engineer who can write an application to
            do it."</i>

            Since Erik obviously is challenged on SQL I decided to keep the SQL <b>really
            simple</b>.

            You are of course correct that if the

            If the admin is allowed to know a little more SQL (or allowed to look up the
            doc) he could do the following instead (separating out the sql for clarity):

            <code>
            # T-SQL: TABLESAMPLE picks roughly 100 rows from the first table
            $sql = "select * from table t1 tablesample(100 rows) inner join table2 t2 on
            t1.key=t2.key"

            Invoke-Sqlcmd -ser dbserver -dat dbname $sql | Export-Csv "sample.csv"

            Send-MailMessage -t "murph@unixzealots.org" -sub "Sample data" `
            -f "admin@windowszealots.com" `
            -body "hi murph, this was too easy." `
            -smtp smtp.windowszealots.com `
            -att sample.csv
            </code>

            Three lines and you have your sample data. It should be obvious that the sql
            scales to joins over many tables.

            So, murph, how would you expect your Unix admins to go about this with your
            favorite stack?
            honeymonster
          • mail honeym <assoc_array

            The least error prone, and least complex, way I know of is to make an sql script file which dumps combined records into one file for each district and a perl script which selects a few for output to a common file.

            The point of this approach is neither elegance nor efficiency, but functional separation, testing, and replication. In fact this is an example of something I do all the time: using the dumbest approach simply because it makes it easy to guarantee that the result is right.

            You can do exactly the same (in fact the perl would be portable) with Wintel. The point of the blog wasn't that this couldn't be done or even is hard to do; the point was that the Wintel people didn't want to go an extra step without lots of formality, hand wringing, and <em>their assertion of control</em> over user decision making in the systems context - where the Unix people responded to the phone calls from their bosses by just getting the job done.

            (In thinking about this, the most elegant and efficient way, assuming one had tens of thousands of records per district and hundreds of districts, would be to pipe all tables to a perl associative array and output the final file from that - not that I would actually do this: with today's gear even horrendous SQL queries often take only a few seconds - much less than checking that the array includes its end points :) )
            murph_z
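
The associative-array idea in the last comment can be sketched roughly as follows. This is a hedged illustration in Python rather than the perl murph_z mentions, and everything concrete in it is an assumption: the two table dumps (`donor_rows`, `gift_rows`), their columns, and the sample size of 2 per district are invented for the example.

```python
import random
from collections import defaultdict

# Hypothetical table dumps: (donor_id, district) and (donor_id, amount).
donor_rows = [(1, "north"), (2, "north"), (3, "south"), (4, "south"), (5, "south")]
gift_rows = [(1, 25.0), (1, 10.0), (3, 5.0), (4, 40.0), (5, 15.0)]

# Join the tables in memory, keyed by donor_id (the "associative array").
combined = {
    donor_id: {"donor_id": donor_id, "district": district, "total": 0.0}
    for donor_id, district in donor_rows
}
for donor_id, amount in gift_rows:
    if donor_id in combined:  # ignore gifts with no matching donor
        combined[donor_id]["total"] += amount

# Regroup the combined records by district, then sample within each district.
by_district = defaultdict(list)
for rec in combined.values():
    by_district[rec["district"]].append(rec)

rng = random.Random(42)  # fixed seed so the selection can be replicated
final = []
for district in sorted(by_district):
    group = by_district[district]
    final.extend(rng.sample(group, min(2, len(group))))
```

Keeping the join, the grouping, and the sampling as three separate passes mirrors the "functional separation" argument above: each stage can be checked on its own before the final file is written.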