Storage networks
Summary: Storage area networks sprang up largely in response to the complexities of managing storage in a Microsoft client-server world - itself ultimately a response to the small machine limitations inherent in both Windows and x86. Today, however, those limitations are receding for most small to mid range businesses and the technologies they spawned should, logically, go out with them.
Back in 1977 Dennis Fairclough had a good idea: build something that would allow multiple small computers to share a single disk drive and in 1979, after his investors forced him to hire Ray Noorda to run the business, this became the first Novell product to reach industry standard status.
Today's replacement for that solution to the costs of sharing and managing storage is called a SAN - and its time is past because we're all going back to Novell's original implementation of the shared storage server idea.
The specifics of the SAN solution evolved in response to three main sources of pressure:
- the Microsoft rack mount approach to SMP necessitated a lot of server to server networking - and no single x86 machine had the power to provide data services to all;
- the typical data center found itself responsible for backing up, storing, and recovering data from hundreds to thousands of desktop PCs - each storing multiple gigabytes of mostly duplicated material; and,
- legal, audit, and cost pressures aligned with IT management desire for more centralized control to make data centralization the only naturally acceptable solution.
As a result the idea evolved that you could set up a separate server network dedicated to acting as a virtual disk drive and then use software running on the machines in that network to manage your data more or less in parallel as a way of getting past the small machine limitation.
The key to making this work, of course, lies in interconnecting the storage rackmounts to facilitate both communication between the servers "parallelizing" data storage and communications with the external layer in the local network hierarchy - and out of that we got separate Fibre Channel (and now FC over 10Gb Ethernet) networks for storage management and the whole business of "fabric" switches ranging from Brocade's 256 port "Intrepid" at $4.5 million down to things like IBM's TotalStorage SAN32M-2 Express Model at about $11,000 for 16 ports.
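To put those two price points on a common footing, here is a minimal sketch that works out the cost per port from the figures quoted above. The prices and port counts are the ones given in the text; the function name `cost_per_port` is only an illustrative choice.

```python
# Per-port cost of the two fabric switches quoted in the text.
# Prices and port counts are taken from the article as given.
switches = {
    "Brocade Intrepid (256 ports)": (4_500_000, 256),
    "IBM TotalStorage SAN32M-2 Express (16 ports)": (11_000, 16),
}


def cost_per_port(price_usd, ports):
    """Return the price of one port in US dollars."""
    return price_usd / ports


for name, (price, ports) in switches.items():
    print(f"{name}: ${cost_per_port(price, ports):,.2f} per port")
```

On those numbers the high end works out to roughly $17,600 per port and the low end to just under $700 per port.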
Looked at objectively SAN technologies are very advanced and reasonably effective - but also completely unnecessary because they all fundamentally respond to a small machine limit that really only exists in the wintel world.
In the mid nineties, of course, nobody wanted to use big machine solutions: the idea of using a two million dollar, four processor, DEC Alpha running OSF/1 to connect PC networks to terabytes of centralized data storage just wasn't a big seller to NT true believers. Instead, therefore, the industry chose to expand the client-server paradigm and evolve Fibre Channel networks to do the same thing using from dozens to hundreds of NT machines.
Today, however, that Alpha can be more than replaced by machines like Sun's 74XX storage servers that fit in a 4u rackspace and don't incur the costs, risks, and commitment to expertise needed to make a bunch of small machines approach storage parallelism.
In effect the new machines replicate the simplicity of Novell's original approach - itself based on using a 16bit MC68000 processor to serve data to multiple 8bit machines built on Z80s and 8080s - and therefore allow the same usage options including physical distribution in the enterprise, automated backup and recovery, and simple storage connectivity.
Historically, of course, better technologies are rejected by the majority of IT decision makers, so what makes this different? The arrival of 10Gb Ethernet as the next generation SAN standard is already forcing consolidation in the SAN vendor community - and that will force some customers to change. The survivors are therefore likely to be those vendors who pre-empt the customer's choice between traditional SAN costs and complexities on the one hand and the simplicity of Sun's low cost approach on the other, by trying to compete with Sun and thus drag the whole industry forward to where it was going in 1979.
Talkback
Back me up
This never happened. Not because it wasn't important - but because the design of SANs made it difficult (read impossible). Whereas a simple LAN can accommodate NFS drives (file-system access), there was no solution for block-level access. This was needed (supposedly) for large databases. So in order to provide this block-level "drive", the World Wide Name (WWN) was born.
It is the limitation of the WWN that prevents SAN backups. A WWN from a drive can only be assigned to a specific HBA (or 2). In order to back up a SAN drive, you would need to add the backup server HBA WWN to ALL SAN drives. Then you would have to handle things like shared access and database quiescence. After all of that, you would still have to map the nameless drives into restoring a certain file for a certain server. That's a lot of crap to go through just to say you can back up SAN from the SAN.
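To make that concrete, here is a toy sketch of the masking problem as I describe it: each LUN only answers to the HBA WWNs on its list, so a backup server has to be added to every list. This is an illustration only, not any array vendor's API; the `Lun` class, the function names, and the WWN values are all made up.

```python
from dataclasses import dataclass, field


@dataclass
class Lun:
    """A SAN 'drive' that is only visible to the HBA WWNs in its mask."""
    name: str
    allowed_wwns: set = field(default_factory=set)


def can_access(lun, hba_wwn):
    """True if the HBA with this WWN is allowed to see the LUN."""
    return hba_wwn in lun.allowed_wwns


def grant_backup_access(luns, backup_wwn):
    """Add the backup server's HBA WWN to every LUN mask.

    Returns how many masks had to be changed.
    """
    changed = 0
    for lun in luns:
        if backup_wwn not in lun.allowed_wwns:
            lun.allowed_wwns.add(backup_wwn)
            changed += 1
    return changed


# Hypothetical WWNs, one application server per LUN.
luns = [
    Lun("db01", {"10:00:00:00:c9:00:00:01"}),
    Lun("db02", {"10:00:00:00:c9:00:00:02"}),
    Lun("mail", {"10:00:00:00:c9:00:00:03"}),
]
backup_wwn = "10:00:00:00:c9:00:00:ff"

print(grant_backup_access(luns, backup_wwn), "LUN masks changed")
```

The number of masks to touch grows with the number of LUNs, and this still says nothing about shared access, quiescing databases, or mapping a LUN back to a file on a particular server.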
So what happens is that each and every Virtual Machine needs its OWN copy of backup software, and the backups still go over the LAN - as they have since the beginning of time . . .
So I guess that there IS a cost for creating a VM. So what was that about taking the "load" off of the network? What a bunch of hooey.
Most people I know
Until the data center burns down
As for that new 10Gb LAN - that is the revenge of Cisco (for losing out on the SAN gold rush earlier).
Cisco
You do know that Cisco has ~45% of the SAN switch market right?
<opinion>
10gb LAN will no doubt have impacts on the storage side of any enterprise. How much remains to be seen. It will probably be minimal because to get anything close to 10gig performance you will need TOE cards, and there just isn't the platform availability for TOE cards as there is for fibre HBA. And then at that point you are spending just as much on Ethernet gear as you are fibre gear. We aren't going to see the massive commoditization of 10gigE as we did 1gigE because there will be no drive to deploy 10gigE to the desktop. 802.11N (and future protocols) will be the deathblow for premise wiring, removing any economies of scale that might benefit 10gigE
</opinion>
Mickey-Mouse Data (McData)
They were late to the game, or they would have a much bigger slice. They let nobodies like Brocade and McData beat them to the punch. They did come on strong later.
Cisco also tried to push iSCSI - with little success. Today, iSCSI looks useless/redundant whereas FcLAN will be a game changer. The death knell of a separate network ONLY for storage is coming.
802.11N and future protocols will not replace premise wiring (entirely). Wireless is just too insecure - and adding encryption only slows it down (and makes administration tougher).
Still wrong
Your knowledge of backup architecture in a SAN environment seems to be limited at best. Hint: There are Fibre Channel tape drives for a reason. Also, SANs inherently don't remove the need for backup software. These applications are still very much needed.
Another misconception is that you seem to think that a SAN would be backed up as a single discrete unit. A SAN is never 'backed up'. You back up the hosts attached to the SAN.
Maybe Wong but not Wrong
What do you feed them? Can you back up a single "meta" from the EMC storage array? How does it get mounted for the tape drive to back it up (sans server)?
[Also, SANs inherently don't remove the need for backup software. These applications are still very much needed.]
It was the promise of moving backup traffic off of the LAN and onto the SAN that was appealing. As for backup software, paying for a license for each VM makes it an expensive proposition - especially if you could do it all at the storage array level . . .
[Another misconception is that you seem to think that a SAN would be backed up as single discrete unit. A SAN is never 'backed up' You back up the hosts attached to the SAN.]
SAN = storage array (not area) network. The storage is carved into "metas" and presented as a LUN. Each LUN (usually) needs to be backed up - somewhere. Which is better:
1). Every LUN gets backed up straight from the storage array to tape.
2). Every server instance (VM) needs a (purchased) software to back up the LUNs OVER the LAN to backup servers (tape or tape+disk). There is also a cost for maintaining the backup environment.
Why would anyone go for #2?
"Serverless" Backup Concepts
The host already has it mounted. You will have a 'dedicated storage node software' in EMC-speak loaded on the host. The backup server directs the backup client on the host to back up its data via the fibre tape drives which are zoned to the host. Voila, no data traveling over the LAN. Here is the cool part -- those fibre tape drives? They don't have to be tape drives at all! They could be a VTL, or a Deduplication appliance that is simply emulating fibre tape drives.
[It was the promise of moving backup traffic off of the LAN and onto the SAN that was appealing. As for backup software, paying for a license for each VM makes it an expensive proposition - especially if you could do it all at the storage array level . . .]
In the grand scheme of things most backup vendors' simple filesystem clients are pretty cheap and are usually sold in at least 5-packs. In a VMware world there are ways of creating crash-consistent images that allow for file level restore using VMware Consolidated Backup. VCB requires some network traffic but is much nicer on the LAN than trying to do it on a VM-by-VM basis.
I'm confused at your beef a little bit. You seem to be saying that if you own a SAN you should be able to totally rid yourself of backup infrastructure. It doesn't work like that. Backup over fabric is but one piece of what storage networking provides.
Broken Promises
Proprietary? Extra Cost? Why does the server have to do ANYTHING?
The idea of SAN was to remove the disk administration task from the server side. No volume management, partitioning, etc. was needed anymore - you just get "presented" with a LUN and you mount it. No muss, no fuss. Sysadmins didn't have to deal with VERITAS-type software on JBODs (you just create an entirely new job classification called storage admin).
Now, if all of the storage tasks are supposed to be "taken care of", then why should you worry about backups? Shouldn't that be done in the background - since it is a storage "task"? As a sysadmin, just give me the LUN and I trust that you are backing things up.
That was the promise of SAN. Paying for all of that incremental infrastructure (separate network, a new department filled with new storage admins) needed justification - MORE than just "We moved the disks from there - over to there". The promise of keeping backup processing off of the LAN (and the server) was key - as networks were overwhelmed with the amount of data needing backup. Consolidate the backup with the storage - and there's savings!
Without the promise what you have is this - I spent millions and created entirely new departments and job classifications (and these guys cost big bucks) - just so I could move disks from the server to that storage array over there.
It is not as hard as you claim it to be.
Not hard, just stupid
My points:
1. The SAN should support backup sans server.
2. Backup software license costs increase linearly with each new VM.
The real world does not agree with you.
Because the data was being actively used (in the federal example), it was stored on a SAN and accessed by several researchers. While this can be done via NFS (or SMB or DFS) exports, with each node in the network backed up separately, this is not practical (workstations can crash or be turned off, whereas a SAN only goes down when an SA takes it down, unless there is no redundant power during a prolonged blackout). All data was exported (or mounted) from centralized servers with SAN switched RAID devices containing the studies.
There are multiple ways for this to be managed, though. One could use a filer (thus having a device that supports SAN and NAS protocols) but this is not always done for financial reasons (or knowledge of both protocols, administrative experience, etc.)
As for backing up the VM, VMs reside on disks and can be backed up as either VM contents only or the whole VM. It all depends on what you are backing up and the costs of backing up the VM v. the data contained in the VM. Some VM systems do support taking snapshots of the VM, thus you only have to back up the snapshots.
Backing up data over a network on off-hours is common, and backups are typically the responsibility of the local sysadmins. Saying that one should not have to just illustrates a lack of experience. Backups and mirrors are run by cron jobs (or the Windows equivalent); the sysadmin only does verification (and reruns a backup when it fails).
The Emperor has no clothes
When I talk about VMs, I mean that each VM attaches to a LUN from the SAN - and that LUN must be backed up. I'm not referring to the actual VM "image".
[Backing up data over a network on off-hours is common, and backups are the responsibility of the local sysadmins typically. Saying that one should not have to just illustrates a lack of experience. backups and mirrors are run by cron jobs (or the Windows equivalent), the sysadmin only does verification (and rerunning a backup when it fails).]
It's been done like that forever (I did that myself). So why did we buy a big expensive SAN in the first place? Answer: Because everyone else did.
Having a SAN or not does not provide jobs.
As for why one got a SAN (or NAS or Filer), it is because (often) there are shared datasets or information/resources that are centrally managed. I understand from your posts here (and your blog) that you have limited experience and/or skills, but unless the organization is of rather significant size or specific product knowledge is needed quickly, this is the domain of a systems administrator.
RE: Storage networks
tx
Avoiding SANs
I really prefer NFS. Anyone who knows Unix, knows NFS. You can train a monkey to manage NFS systems. Try that with SANs/FC.
True - but don't monkeys prefer SANS?
Sorry - couldn't stop myself..
:)
So you prefer SANS? nt
Needs
When you say 'NFS' you really are saying 'NAS', which is somewhat of a holy war in storage management.
Really NAS will work fine for many shops. The downsides from my personal perspective would be reliability and throughput, as compared to its FC-SAN brethren. Feature wise, it all depends on what's hosting your NAS. With high end vendors like NetApp you'll get much of the functionality that a SAN would get you.
I do take a little issue with the point that NAS is 'less complicated' than SAN. Larger NAS implementations can get complicated fast. Really I think the issue is that NAS protocols/components are just more familiar to most people.
Again there is no silver bullet, it's all about needs.
NAS