A business problem

By Paul Murphy | January 9, 2010, 12:00am PST

Summary: I’m in an unusual position here: I know I’m right, I just have no idea why. I know, almost as many “I”s as you find in an Obama speech, but this is an interesting problem and one a lot of IT and business managers confront every day: if you don’t have quantitative data, how can you know you’re right?

I have a problem - not that one :). This one involves that rarity for me: self-doubt about a technical recommendation that juggles system ownership issues against out-sourcing and customer co-location.

The whole thing started as a simple question about a hardware refresh for an application a client has had running on a little AMD/Linux grid for about five years now. The application takes in client files measured in the hundreds of gigabytes (often terabytes) and produces a relatively small number of large images in the four to eight gigabyte range.

Twenty years ago tapes came by truck; that became boxes of cassettes, then FedEx packs stuffed with optical platters and later DVDs, and now some customers want to do everything instantly over the internet.

Unfortunately the company is in Canada and network costs are well over American expectations: the two 100Mbps ports they currently maintain on their local metropolitan area network cost about $6,000 a month in total - and those ports are about 65% idle, with average monthly volume running just below 6TB between them. Note too that actual performance on links crossing peering boundaries on this piece of backbone often drops by an order of magnitude, so customer delivery of a streamed 2GB image zip can take forty minutes to an hour.
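As a rough sanity check on those figures, here is a minimal Python sketch of the arithmetic. It assumes decimal units (1GB = 10^9 bytes, 1Mbps = 10^6 bits per second) and ignores protocol overhead; the function names and the 1TB example file size are made up for illustration.

```python
# Back-of-envelope link arithmetic using the figures quoted above.
# Assumptions (not from the article): decimal units and zero protocol overhead.

def transfer_seconds(size_bytes: float, rate_bps: float) -> float:
    """Ideal time to move size_bytes over a link running at rate_bps."""
    return size_bytes * 8 / rate_bps

def implied_rate_bps(size_bytes: float, seconds: float) -> float:
    """Effective throughput implied by an observed transfer time."""
    return size_bytes * 8 / seconds

PORT_BPS = 100e6            # one 100Mbps port
MONTHLY_COST_USD = 6000.0   # both ports together
MONTHLY_VOLUME_TB = 6.0     # "just below 6TB", so cost per TB is a lower bound
IMAGE_BYTES = 2e9           # the 2GB image zip
CLIENT_FILE_BYTES = 1e12    # hypothetical 1TB client file, for scale

if __name__ == "__main__":
    print(f"cost per TB moved: at least ${MONTHLY_COST_USD / MONTHLY_VOLUME_TB:,.0f}")
    print(f"2GB image at full port speed: {transfer_seconds(IMAGE_BYTES, PORT_BPS):.0f} s")
    print(f"2GB image at one tenth speed: {transfer_seconds(IMAGE_BYTES, PORT_BPS / 10) / 60:.1f} min")
    for minutes in (40, 60):
        mbps = implied_rate_bps(IMAGE_BYTES, minutes * 60) / 1e6
        print(f"{minutes} min delivery implies about {mbps:.1f} Mbps")
    hours = transfer_seconds(CLIENT_FILE_BYTES, PORT_BPS) / 3600
    print(f"1TB client file at full port speed: {hours:.1f} h")
```

Under those assumptions the 2GB zip needs 160 seconds at full port speed and about 27 minutes at one tenth of it, so the observed forty to sixty minutes implies an effective rate of roughly 4.4 to 6.7 Mbps. A 1TB inbound file would tie up a whole port for about 22 hours even at full speed.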

I’ve suggested three scenarios to them:

  • convert their software to Cell (specifically to Linux mini-supers made up from PlayStation boards), and license whole racks as appliances for customers to operate on their own premises.
  • co-locate the grid with, and effectively outsource data center operations to, a high bandwidth data center operator in Omaha.
  • do the traditional thing: upgrade the grid, upgrade to Gbit ethernet ports, and charge the customer who wants internet turn-around a bandwidth surcharge.

I like the first option best: go appliance computing! Nothing wrong with this picture: it’s positive for the company and its customers - and the software, written in C and F77 for BSD4.3 on VAX but targeted to an Elxsi 6400 and since converted first to SunOS and then to SuSE/MP, is easy to port (but hard to optimize) for Cell.

So what’s wrong with this? Well, it’s wholesale business change and even testing it properly requires putting other options on hold for at least six months, investing a bunch of expensive manpower, and risking the business relationship with a couple of important customers.

Option two, my second choice, is purely cost driven. Putting new machines into the Omaha center would cut processing time by an average factor of about four relative to the existing system, provide about eight times the burstable bandwidth for customers, and end up costing the company less than half what it pays out for data center operations now.

Equally importantly, the impact on customers is muted: internet customers don’t typically care where the service is, and courier customers mostly wouldn’t care because they tend to ship data from field offices, not head offices.

Unfortunately for my line of thinking here the business founder (and still majority owner) won’t consider this - and the idea of moving processing without moving the company and its people strikes me as a plan to end the business badly: with some customer’s confidential data falling into hands that aren’t supposed to get it.

Option three is popular with some of the key players, but is a bet on the future being the past - and how often has that happened in anything IT related?

To complicate matters, the company has one competitor offering much the same service for roughly the same money (not how they see it, of course :) ), everybody’s tight for cash, and volumes went through the floor last year when the boom died - so even casting internet transfers as an optional new, and premium, service has its risks.

What’s needed for this decision is a nice, clear way to quantify the relative risks of each approach - and that’s where I’m stumped. I can argue a strong case either for or against each of these three options - but risk numbers I haven’t got, and risk numbers are what I need.
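For what it's worth, the usual way to structure such a comparison is expected loss: the probability of each bad outcome times its cost, summed per option. The sketch below shows only the mechanics. The `Risk` class and function names are invented for illustration, and every probability and dollar figure in it is a hypothetical placeholder, not data from this engagement; the real numbers are exactly what is missing.

```python
from dataclasses import dataclass

@dataclass
class Risk:
    name: str
    probability: float  # chance the event occurs over the planning horizon, 0..1
    impact_usd: float   # cost to the business if it does occur

def expected_loss(risks):
    """Sum of probability-weighted impacts for one option."""
    return sum(r.probability * r.impact_usd for r in risks)

def rank_options(options):
    """Return (option, expected loss) pairs, lowest expected loss first."""
    scored = [(name, expected_loss(risks)) for name, risks in options.items()]
    return sorted(scored, key=lambda pair: pair[1])

if __name__ == "__main__":
    # HYPOTHETICAL placeholder inputs: replace with real estimates.
    options = {
        "appliance": [Risk("key customer lost during transition", 0.25, 400_000.0)],
        "co-locate": [Risk("confidential data leak", 0.125, 1_000_000.0)],
        "upgrade in place": [Risk("surcharge drives customers away", 0.5, 200_000.0)],
    }
    for name, loss in rank_options(options):
        print(f"{name}: expected loss ${loss:,.0f}")
```

The ranking is only as good as the inputs, so in practice each probability would be a range rather than a point estimate, and the output would be most useful for showing which estimate the decision is most sensitive to.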

Ever see a cartoon in which the hero flies off a cliff or a building and floats serenely in the air before discovering the lack of support and plummeting down? That’s me on this issue: serenely confident in the opinion that business change is the way to go, but entirely without a leg to stand on. Duh, anybody got any ideas? Real data on leaks from out-sourced versus hugged systems that take system complexity and change into account? How about real data on changing the business model to keep up with IT changes? How about just real data on the speed and reliability of large file exchange across the internet?

Paul Murphy (a pseudonym) is an IT consultant specializing in Unix and related technologies.

Disclosure

I do not work for, or otherwise receive anything from, any of the companies I write about. I have some money in a number of funds that bet on the markets, including the technology market, but have no direct control over how these funds are administered or what investments are made. I use Sun and Apple technology both at home and at work.

Biography

Originally a Math/Physics graduate who couldn't cut it in his own field, Paul Murphy (a pseudonym) became an IT consultant specializing in Unix and related technologies after a stint working for a DARPA contractor programming in Fortran and APL. Since then he's worked in both systems management and consulting for a range of employers including KPMG, the government of Alberta, and his own firm. In those roles he's "been there and done that" for just about every aspect of systems management and operation.

Talkback Most Recent of 38 Talkback(s)

  • Colocate vs AWS
    In terms of co-location, why would AWS not be an option?

    Are you trying to 'contain' custom grid apps and move them horizontally to a new location?

    AWS has their Elastic Map Reduce (Hadoop implementation) for 'on demand' use, but you *should* be able to preserve your proprietary apps in a move to AWS.

    Remember, AWS is private cloud capable, meaning the subnets you create are able to be isolated from the public cloud, if you so choose.

    Also, if the nature of this process is one where it isn't continuous and inactivity periods are clearly delineated, then you can simply 'turn off the lights' and close shop (the meter doesn't run when it's not in use) during those known intervals.

    So with AWS, you only pay for the resources that you consume.

    Here is a handy AWS Monthly Calculator

    In the interest of full disclosure, I am an Amazon Web Service Provider.

    Thank you Murph and Happy New Year

    Dietrich T. Schmitz
    D T Schmitz
    9th Jan 2010
  • AWS, Sun, etc
    I glanced at several cloud style options, including AWS. My problem with these is twofold:

    1 - getting the files to and from them is a much bigger deal than you'd think. Use physical transport and you get additional security and control issues, use the net and costs skyrocket.

    2 - I did not find anybody's security promises credible.

    Basically I do not believe that using servers shared with others and run by people you don't know and can't get to (for example, none of the 3 I contacted would/could tell me how many people had physical and/or electronic access to the hardware, particularly disk; all 3 nattered about how great their controls are instead) makes sense where data is extremely valuable and leaks, especially if uncaught, could destroy the company.
    murph_z
    9th Jan 2010
  • 2) Full disk encryption, ACLs???
    Please expand on #1 if you would Murph.
    D T Schmitz
    9th Jan 2010
  • None of this is good enough - and
    the killer is that source files are not encrypted.

    It's not easy to see how to fix that either because the paranoid assumption that the bad guys have access to everything at one end of the pipeline pretty much defeats all practical solutions.

    as far as I know... ?
    murph_z
    9th Jan 2010
  • I am not following you
    Your answers are brief and I suspect you are not completely confident in what you perceive to be the case.

    The significant cost-reduction potential alone should be incentive to dig deeper into replacing colocated services with AWS' Virtual Private Cloud.

    Whole disk encryption and ACLs keep any unauthorized access from occurring.

    As you know, security is a process, and it applies to 'everywhere'.

    Perhaps, if you 'do the math' with the Calculator I provided, you can arrive at a ballpark figure for a pilot project's one-month expense.

    Otherwise, I cannot help but feel that there is something which you have as of yet not touched on which presents an issue for you.
    D T Schmitz
    10th Jan 2010
  • Not confident
    You say: "Your answers are brief and I suspect you are not completely confident in what you perceive to be the case."

    Kinda what this blog entry is about, right?

    ---
    but if you don't grok why data on DVDs is at risk.. umm: somebody has to receive the disks and load the data (sending it via the net is too slow/expensive /*TB!*/). That requires local, physical, access. Getting the data to the client's office, having them recode it and then send an encrypted disk is too slow and adds both cost and courier risk - so how can the in/out steps be secured?

    My paranoia is that you gotta trust somebody, somewhere - and trusting a few people you hire and see every day is a lot easier than trusting a cloud of people you don't know, don't see, and don't directly interact with.
    murph_z
    10th Jan 2010
  • Not an issue
    Take a look at AWS Import/Export and the AWS Import/Export Calculator.

    Have you evaluated volume encryption methods?

    TrueCrypt can carry your data to its destination either as an encrypted 'file' container or as a whole-disk-encrypted device. Either is of no use to anyone other than the keyholder, which means the drive can be mailed and mounted, but only unlocked by you or your designee remotely over ssh.

    That's a one-time affair to backload the existing database.

    Ongoing, all activities would be undertaken remotely through your console on a fully encrypted series of mount points.

    No one at AWS could or would determine what you are doing.

    I don't see an issue. Have you taken your meds today? (kidding...kidding)

    Tell us Murph, what is really on your mind? ;)
    (Or, feel free to contact me off-line--see my contact page).
    D T Schmitz
    10th Jan 2010
  • Encryption requires keys
    And who has access to the keys? SOMEWHERE in the chain of encryption is the point where you need to apply the private key. Where is it? And since processing requires non-encrypted data, what about temp space or even reading straight from memory?
    Roger Ramjet
    12th Jan 2010
  • @Roger: Straw Man Argument
    Let's say your theoretical company had more than one Data Center. And you chose for economy to have a hard drive mailed from your Data Center A to B where an import/export facility handles back loading of data.

    Let's say that theoretical employee or even 'evil doer' outsider has found his way into the AWS physical site and plans to hatch a scheme which will look into your virtual machine's temp and virtual memory looking for human readable information.

    The likelihood of your scenario happening at your Data Center A or B is higher because the user can focus his work on a 'known' dedicated physical target machine.

    But the likelihood of a user knowing which AMI you have chosen at AWS, much less, 'where' in your distributed cluster to look is so improbable that it makes the mere suggestion that it could occur sound, well, silly.

    Your scenario is a straw man argument conveniently thrown up for something that realistically cannot happen.

    So, I maintain AWS is as safe, if not safer, than any colocate or business-owned Data Center.

    As such, I see no issue with my recommendation to Murph to employ AWS for his business problem.

    Dietrich T. Schmitz
    Amazon Web Service Provider
    D T Schmitz
    12th Jan 2010
  • Once again - keys
    In order to DEcrypt data, you need a private key. This private key MUST be clear text - or it won't work. This clear text private key MUST be located SOMEWHERE. In fact, some human must input it. This is the Achilles Heel of the system.
    Roger Ramjet
    13th Jan 2010
  • @Roger makes a very valid argument
    I'm surprised you're dismissing it. It's not a straw man. It's remedial cryptography.
    Erik Engbrecht
    14th Jan 2010
  • Too early for figures
    The cloud is a lottery. There can be no data security in such a model beyond token marketing assurances.

    Outsourced server management, typically in co-location facilities with many customers, on servers hosting many customers (no identification beyond a credit card), where your information can be relocated at any time (potentially across national borders) raises serious data security challenges.

    Evaluating risk is difficult; it's too early for an accurate picture to emerge from real world studies.

    Managing large data sets is much better understood. Look to the media solution providers for answers (moving far larger volumes than Murph indicated in his post). These solutions scale very well, with bandwidth costs the only issue (but accurately definable).
    Richard Flude
    10th Jan 2010
  • NY State Lottery Slogan: You have to be in it to win it.
    Seriously,

    What is your level of experience with AWS Richard?

    AWS does a fairly decent job of providing the tools for determining which Amazon EC2 Instance Type is a best match for a given application.

    All one need do is test their app on a given instance type over a period of time to determine if it is sufficient to support their needs. The 'on-demand' utilitarian low-cost nature of AWS allows one to test an instance and find over a short period of time an answer to their usage case.

    The EC2 Instance Type accurately describes the minimum virtual processor performance one can expect to have.

    Bear in mind, S3, is a shared resource pool so there might be a degree of variability in I/O to consider.

    Otherwise, to suggest that AWS is a 'Lottery' is really simply not true.

    So Murph, I don't see based on your limited, guarded comments why AWS would not be a good solution for your 'business problem'.

    Murph, what is your 'hidden agenda'?
    You are fishing for something.

    Thank you.
    Dietrich T. Schmitz
    Amazon Web Services Provider
    D T Schmitz
    11th Jan 2010
  • It's a tad rude
    to refer to Murph while responding to Richard's post. You did a great job of confirming what Richard said. Somewhere the data will not be encrypted - and there is the weak spot.
    Roger Ramjet
    12th Jan 2010
