"Geeks rule!"? Yes - but what that means depends.
Summary: A small homily about pain - and cards - and the spread of Linux
An odd experience this week: I wanted to get a bunch of people collecting roughly similar data (on donors) to produce samples for me that were both randomly selected and geospatially representative. Since that's oxymoronic, what I actually asked them to do was to stratify by district and then select randomly within each stratum.
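A minimal sketch of "stratify by district, then select randomly within each stratum", in Python rather than anything the participants actually ran. The record layout, the district names, and the `stratified_sample` helper are all assumptions made up for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, per_stratum, seed=None):
    """Group records by key(record), then draw a simple random sample in each group."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)
    sample = {}
    for name, members in strata.items():
        # A stratum smaller than the requested size is taken whole.
        k = min(per_stratum, len(members))
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical donor records: (donor_id, district).
donors = (
    [(i, "north") for i in range(10)]
    + [(i, "south") for i in range(10, 20)]
    + [(i, "east") for i in range(20, 22)]
)

picked = stratified_sample(donors, key=lambda d: d[1], per_stratum=4, seed=1)
for district, rows in sorted(picked.items()):
    print(district, [donor_id for donor_id, _ in rows])
```

Every district appears in the result however small it is, which a plain random draw over the whole table would not guarantee.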
The striking thing about the exercise was that an easy majority have moved the database for this to MySQL on Linux with holdouts on Solaris (mostly also MySQL), HP-UX, and various Microsoft configurations - but the unexpected thing was that none of the Unix people had any difficulty either understanding it or doing it; while the Wintel people equally unanimously wanted meetings, paperwork, "a better understanding of the requirements", and in something like three out of four cases additional monies from their bosses before they could see about getting it done.
I was contemplating the difference between the Wintel marketing image as the solution for do-it-yourselfers who want to avoid having to deal with systems managers and the reality of the inflexibility and burdens its protagonists impose on users when, just as I was explaining the joys of the formulation "I'm from Microsoft and I'm here to help", some black ice shifted my focus to practicing breakfalls on the sidewalk.
So as I'm lying there wondering where the phone went, one of the PC people who'd come out behind me looked down all concerned like to ask "did you fall?" Well, I'm a Jeff Foxworthy fan, so I dug out somebody's business card from a previous meeting and handed it to him.
I've no idea whose card it was or what he made of it - but the analogy between what really happens when businesses replace their Unix infrastructures with Microsoft people and that slickly invisible ice on the sidewalk? Yep: that I'll buy into.
Talkback
That black ice can be a b1tch
And white ice is not?
Black ice is harder to see...
actually thin, clear ice, usually in small patches, which appears black on
blacktop roads and hence is very hard to see. White ice is easy to see,
and tends to be in larger patches, and therefore nowhere near as much
of a b1tch.
So, for all the PC police from warm climates out there, this is *not* a
racially motivated term. Color is a physical attribute. Get over it.
I obviously should not stop at the computer in the morning before
having coffee.
Sounds icist to me!
How do you randomly select data in SQL?
use any number of little scripting languages, but what's a
Windows admin to do? He needs a DBA who knows archaic
SQL or a software engineer who can write an application to
do it.
Here is one way (line) how a Windows admin could do it
to text and use any number of little scripting
languages</i>
Well, so could a Windows admin. Ever heard of
PowerShell?
<i>but what's a Windows admin to do? He needs a
DBA who knows archaic SQL or a software
engineer who can write an application to
do it</i>
Or he could just do
<code>Invoke-Sqlcmd -ser dbserver -dat dbname
"select * from sometable" | random -count 100 |
ConvertTo-Csv</code>
Yup. No need to dump to a file, just pipe it
directly to the <i>random</i> cmdlet which will
choose a number of random objects from the
stream. Finally the records are exported in csv
format. No archaic parsing.
What???
It's amazing how you say that after telling us for so long that *nix doesn't stand a chance against windoze because it forces you to use the command line.
Did I say that?
If you heard this argument I believe it must
have been referring to the instances where you
*can not* accomplish a task in Linux/Unix
without resorting to the CLI and editing text
configuration files (in all sorts of
<i>strange</i> formats - from INI over the
Apache pseudo-xml to peculiar colon-delimited
passwords files).
In Windows you <i>can</i> perform <b>all</b>
administrative tasks through the GUI.
<i>Automating</i> those tasks may require
scripting. This is where PowerShell comes in.
How's that different?
type thing I would expect a *nix admin to do.
It's good to see that Windows 2008 Server is finally catching up with
Unix from the seventies, and that MS is even being gracious enough to
backport such innovations to earlier versions of Windows.
What's a fancy new tool and associated techniques in Windows land is
a decades old culture in Unix land. The task may be equally easy in
Windows, but it will be years before average Joe Windows admin gets
out of his GUI and uses them - if he ever does given the Windows
culture.
Not much
for many years. There was wsh with JScript and
VB scripting - but they lacked the useful
piping feature of *sh shells.
But PowerShell is more than just catching up.
It is waaaay more consistent and leverages the
object-oriented nature of Windows. Thanks to
that PowerShell accomplishes the same as bash
or zsh with far fewer and simpler commands.
That's because PS has 1-upped the *nix shells:
Object piping means that commands become
simpler (to implement as well as to learn) and
PS scripts become easier to write and read
because there is no need for strange parsing
and formatting to make ends meet (no need for
awk, sed, cut etc).
Sounds complicated
I take my statement back. MS hasn't managed to catch up to *nix from
the seventies. Thanks for saving me the time of looking at PowerShell.
After your first post I was seriously considering it.
Speaking of *nix culture and Windows culture
stuck in a 1970 mindset where me-us-everyone
security is sufficient, where the only
securable objects are the ones in the file
system, where every command overloads what -f
means and where every application *must* define
its own configuration file format.
Dismissing a tool by homing in on a single
phrase is sooooo mature. Way to go.
If you haven't noticed...
would make pretty much every object securable.
My objection to OO doesn't have anything to do with Unix versus
Windows. Plenty of Unix bigot programmers use languages like C++,
Java, Python, Ruby, and even C# - all of which are object oriented to
varying degrees and according to varying definitions.
My objection is that languages that are OO have significantly worse
compositional properties and weaker type systems than functional
languages like the ML family and Haskell while lacking the flexibility
and simplicity of text streams across a pipe. That's not to say I think
OO is useless, especially in hybrid OO languages (Python, Ruby, Scala,
etc) where you aren't completely forced into the object paradigm.
For administrative scripting tasks a type system just gets in the way,
and most of the time so do things like classes, even if they are
dynamic. A prototype-based OO system like JavaScript's might
actually work well, but I've never tried it for such tasks.
I object! Take a look
most often used commands in *nix: ls and ps.
Now look closely at how many of those options
1) filter on some property not usually reported
in the listing
2) control output format so that both humans
and next-in-pipeline commands may understand
the output.
3) select related information, e.g. threads.
ps alone has some 60+ options!
Now consider the ps command in PowerShell:
<code>
Get-Process
-Name <string[]>
-ComputerName <string[]>
-FileVersionInfo [<SwitchParameter>]
-Id <Int32[]>
-InputObject <Process[]>
-Module [<SwitchParameter>]
</code>
That's it.
Now, is the *nix command more powerful? Nah -
most of what it does is about output formatting
<i>because some other command should be able to
parse</i> it. Some options control whether a
header should be printed. Other options control
what information goes into the output. Many
options control how that information is
displayed, e.g. dates, sizes, string
(delimited?) etc.
In PowerShell, the ps command outputs
<i>objects</i>. Another command can readily
consume them <i>without needing to sed, cut or
awk</i> anything. See, the object-orientation
lifts the formatting and output selection out
of the commands. They become simpler to
implement and simpler to understand. They do
"one thing only" - unlike the *nix beasts which
must find the information and <i>then</i>
format in text.
So answer a question for me...
effectively introducing a new class. For example, in the text processing
way, I might have a script that takes tabular data, removes some
columns, and adds some new ones in. Doing this is trivial using Unix
style scripts because it's just text. How do I do it in PowerShell?
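For concreteness, the "text processing way" described here might look like the sketch below: read delimited text, drop a column, add a computed one, write text back out. It is written in Python only to keep it self-contained; the column names (`name`, `phone`, `q1`, `q2`, `total`) and the `reshape` helper are hypothetical.

```python
import csv
import io

def reshape(text_in):
    """Drop the 'phone' column and add 'total' = q1 + q2, text in and text out."""
    reader = csv.DictReader(io.StringIO(text_in))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["name", "total"], lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow({"name": row["name"],
                         "total": int(row["q1"]) + int(row["q2"])})
    return out.getvalue()

# Hypothetical tabular input, as it might arrive on a pipe.
sample = "name,phone,q1,q2\nann,555-0100,10,5\nbob,555-0101,7,8\n"
print(reshape(sample), end="")
```

Nothing here declares a new type: the rows stay plain text plus a dictionary per line, which is the property the question is asking about.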
Fair question.
an OO language would be effectively introducing
a new class. For example, in the text
processing way, I might have a script that
takes tabular data, removes some columns, and
adds some new ones in. Doing this is trivial
using Unix style scripts because it's just
text. How do I do it in PowerShell?</i>
First off you have to realize that you seldom
deal with raw text in PowerShell. Why? Because
the built-in cmdlets all return streams of
<i>objects</i> - not text. Hence I don't need
to (re)parse the output from ps or ls - I can
just work with the properties, methods and even
events directly from PS:
<code>ps|where{$_.WorkingSet -gt 100MB}</code>
<i>ps</i> returns a stream of Process objects.
Process happens to have a property called
WorkingSet which indicates how much memory it
uses. The <i>where</i> cmdlet filters using a
script block - <i>$_</i> represents the
iterator.
So the above command tells me which processes
use more than 100MB. Note that the properties
are typed. 100MB is actually a number.
The above command still returns a stream of
Process objects. When they "fall off" the end
of the pipeline they are being rendered
according to a default format, unless I decide
otherwise. Key point is that the objects still
retain all their properties and methods. Unlike
the *nix ps command where the only information
forwarded is the text that you see.
I could do
<code>ps|where{$_.WorkingSet -gt
100MB}|fl</code>
which would list all processes in <i>list</i>
format. This will list each property on a new
line and an extra blank line between each
process.
<code>ps|where{$_.WorkingSet -gt 100MB}|ft
Name,WorkingSet,Threads</code>
which would list 3 columns, the "threads"
column with multiple values (thread IDs).
I could do
<code>ps|where{$_.WorkingSet -gt
100MB}|foreach{$_.WaitForExit()}</code>
Which would wait for all of the "big" processes
to terminate before the script continued.
Keep in mind that "text" is just special
objects. In PS, text is streams of strings. If
I <i>really</i> wanted to process text PS has
ample tools for that: I can import/export
delimited files with and without headers, I can
split strings on delimiters into arrays, join
them again or drop down to regular expressions.
I'm not trying to dodge the question, but you
have to be more specific if I am to demonstrate
the PowerShell equivalences to awk, sed, cut
etc.
awwwww..
A for effort!
Agreed
Maybe MS should try a new slogan: "Making the past licensable again" or "Hide the origin" or "Facing Forward, Marching Backwards" ("Hypocrites is us"?)
And, of course, in reality, the line quoted far above is a CLI demo, not something that would work. One of the fun consequences of OO in this case is that the job explodes in complexity once you start scripting the sql command (multiple tables) and structuring the output line. (And note that you can't adopt either of the obvious solutions without complicating the command beyond redemption, because you have to stratify and choose randomly within each stratum, not choose randomly from the output after stratify and select.) I.e., you'll be doing a foreach one way or the other.
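A rough sketch of that "foreach one way or the other" shape: one random-order query per district instead of one random draw over the whole result. It uses Python's bundled SQLite purely as a stand-in (MySQL would use `RAND()` where SQLite uses `random()`); the single `donors` table, its columns, and the `sample_per_district` helper are assumptions, and the real job involved multiple tables.

```python
import sqlite3

def sample_per_district(conn, per_district):
    """Run one ORDER BY random() LIMIT query per district and collect the rows."""
    districts = [row[0] for row in conn.execute(
        "SELECT DISTINCT district FROM donors ORDER BY district")]
    sample = []
    for district in districts:
        sample.extend(conn.execute(
            "SELECT id, district FROM donors WHERE district = ? "
            "ORDER BY random() LIMIT ?",
            (district, per_district)).fetchall())
    return sample

# Hypothetical data: 50 donors in north, 8 in south, 2 in east.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE donors (id INTEGER PRIMARY KEY, district TEXT)")
conn.executemany(
    "INSERT INTO donors (id, district) VALUES (?, ?)",
    [(i, "north" if i < 50 else "south" if i < 58 else "east") for i in range(60)])

rows = sample_per_district(conn, 3)
for donor_id, district in rows:
    print(district, donor_id)
```

Sampling the full table first and grouping afterwards would let the large district crowd out the small ones, which is why the loop over strata comes before the random selection.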
Murph, please show me
answering.
Erik was claiming this about a Windows admin: <i>"He needs a DBA who knows
archaic SQL or a software engineer who can write an application to
do it."</i>
Since Erik obviously is challenged on SQL I decided to keep the SQL <b>really
simple</b>.
You are of course correct that if the
If the admin is allowed to know a little more SQL (or allowed to look up the
doc) he could do the following instead (separating out the sql for clarity):
<code>
$sql = "select * from table t1 tablesample(100 rows) inner join table2 t2 on
t1.key=t2.key"
Invoke-Sqlcmd -ser dbserver -dat dbname $sql | Export-Csv "sample.csv"
Send-MailMessage -t "murph@unixzealots.org" -sub "Sample data" `
-f "admin@windowszealots.com" `
-body "hi murph, this was too easy." `
-smtp smtp.windowszealots.com `
-att sample.csv
</code>
Three lines and you have your sample data. It should be obvious that the sql
scales to joins over many tables.
So, murph, how would you expect your Unix admins to go about this with your
favorite stack?
mail honeym <assoc_array
The point of this approach is neither elegance nor efficiency, but functional separation, testing, and replication. In fact this is an example of something I do all the time: using the dumbest approach simply because it makes it easy to guarantee that the result is right.
You can do exactly the same with Wintel (in fact the perl would be portable). The point of the blog wasn't that this couldn't be done or even is hard to do; the point was that the Wintel people didn't want to go an extra step without lots of formality, hand wringing, and <em>their assertion of control</em> over user decision making in the systems context - where the Unix people responded to the phone calls from their bosses by just getting the job done.
(In thinking about this, the most elegant and efficient way, assuming one had tens of thousands of records per district and hundreds of districts, would be to pipe all tables to a perl associative array and output the final file from that - not that I would actually do this: with today's gear even horrendous SQL queries often take only a few seconds - much less than checking that the array includes its end points :) )