IP storage -- inspiration or insanity?

If Indiana Jones were alive today, he'd eschew dull old archaeology for the thrills of distributed terabyte storage. At least, that's what the manufacturers would have us believe.

The world of storage area networks is no stranger to rip-roaring excitement -- and the most spine-tingling development of late, they say, is the IP-based Storage Area Network (SAN), where huge sets of hard disks, tape drives and other mass storage devices work in perfect harmony thanks to open internet protocols.

Of course, using IP to access storage is nothing new. The venerable Network File System (NFS) and Common Internet File System (CIFS) have been used to connect clients to file servers for many years. But that gives file-level access, whereas SANs need block-level access to data, in effect removing the server and leaving the storage device directly connected to the network. SANs also typically allow multiple redundant links between host and storage device, and provide transparent mirroring and backup. They should also provide a single, simple management point for multiple linked devices.
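The gap between file-level and block-level access is easier to see in a few lines of code. What follows is a minimal sketch rather than anything a real SAN runs: a temporary file stands in for a raw disk, and the 512-byte block size and the helper names (read_block, read_file, make_fake_disk) are assumptions made purely for illustration.

```python
import os
import tempfile

BLOCK_SIZE = 512  # assumed block size in bytes, for illustration only


def read_block(device_path, lba, count=1):
    """Block-level access: fetch `count` blocks starting at logical block `lba`."""
    with open(device_path, "rb") as dev:
        dev.seek(lba * BLOCK_SIZE)
        return dev.read(count * BLOCK_SIZE)


def read_file(path):
    """File-level access: name a file and let the server work out which blocks hold it."""
    with open(path, "rb") as f:
        return f.read()


def make_fake_disk(num_blocks=8):
    """Create a temporary file standing in for a raw disk; block n is filled with byte n."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as disk:
        for n in range(num_blocks):
            disk.write(bytes([n]) * BLOCK_SIZE)
    return path


disk = make_fake_disk()
block3 = read_block(disk, lba=3)  # the caller addresses the device by block number
whole = read_file(disk)           # the caller addresses the data by name
os.remove(disk)
```

With NFS or CIFS the client only ever makes the second kind of request. A SAN host makes the first kind, and runs its own file system on top of the blocks it gets back.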

There's a lot of confusion over protocols and physical transport mechanisms in storage, due in part to the way SANs have evolved as a mix of storage and networking. Traditionally, if a hard disk was SCSI it meant that the disk had a fast SCSI parallel interface and responded to SCSI commands sent to it on that interface. The interface -- the physical transport mechanism -- and the command set -- the protocol -- were indivisible. These days, the two aspects are separated. The ANSI SCSI Architecture Model (SAM-2) defines the command and data transfer protocol as something that can be carried on any physical transport mechanism.
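As a rough illustration of that separation, the sketch below builds one SCSI command and hands the identical bytes to three different carriers. The READ(10) layout (operation code 0x28, a 32-bit block address, a 16-bit block count) is the standard one. The wrap_for_transport function and its text header are invented for this example and do not correspond to any real framing format.

```python
import struct

READ_10 = 0x28  # SCSI READ(10) operation code


def build_read10_cdb(lba, num_blocks):
    """Build the 10-byte READ(10) command descriptor block (CDB)."""
    # Fields, big-endian: opcode, flags, 32-bit logical block address,
    # group number, 16-bit transfer length in blocks, control byte.
    return struct.pack(">BBIBHB", READ_10, 0, lba, 0, num_blocks, 0)


def wrap_for_transport(transport_name, cdb):
    """Hypothetical framing: a made-up header naming the carrier, then the CDB."""
    return transport_name.encode("ascii") + b":" + cdb


cdb = build_read10_cdb(lba=2048, num_blocks=8)

# The same command bytes travel unchanged, whatever carries them.
frames = [wrap_for_transport(name, cdb) for name in ("parallel", "fc", "ip")]
```

Only the wrapper differs between the three frames. The command inside is byte-for-byte the same, which is the point of defining the protocol independently of the transport.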

Fibre Channel (FC), currently the most mature and widely installed SAN standard, carries a serial version of SAM-2 on top of a very fast physical transport layer. The FC standard copes with error correction, address translation, framing and so on at gigabit speeds, over distances of up to 10 kilometres. Beyond that distance, non-FC techniques have to be used -- either putting FC signals onto more efficient fibre optic systems, or sending FC packets through an IP tunnel.

This tunnelling system, called FC/IP, lets users set up what's in effect a virtual private network across an IP system, through which FC data flows as if through a transparent connection. That works for point-to-point systems, but the details of the SAN traffic are in effect invisible to the IP networks carrying it. This makes the traffic hard to manage and prioritise with any degree of finesse, and prevents it from being routed between more than two points. For that, you need to put SAM-2 directly into IP -- a technology called Internet SCSI, or iSCSI.

iSCSI was created by IBM and Cisco. It's not the only IP SAN protocol -- there are others from Adaptec, Nishan, et al -- but it's been through most of the IETF submission process and is the most widely known and discussed. There are also a few iSCSI devices available, reflecting the relative maturity of the standard. The advantages to building a SAN out of IP packets mostly boil down to the very high level of experience, tools and comfort most corporates have with the IP protocol. These are not to be sniffed at -- the benefits of being able to deploy, configure and control your storage area network seamlessly with the rest of your IP networking will be significant. The disadvantages, however, are currently legion.

For a start, TCP -- the most common transport protocol run over IP -- has much weaker error detection and correction than FC: TCP relies on a 16-bit checksum, whereas FC frames carry a 32-bit cyclic redundancy check. The difference becomes statistically significant when dealing with very large data transfers. Time-sensitive traffic such as streaming video doesn't cope well with TCP's habit of retransmitting blocks when an error's found. And sending huge amounts of corporate data over an IP network at very high speeds potentially exposes enormous numbers of company secrets, but encryption -- even for slow networks -- can require a lot of processor power.
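One concrete weakness of a simple additive checksum can be shown in a few lines. The sketch below computes a 16-bit ones' complement sum of the kind TCP uses (leaving out the pseudo-header a real TCP stack adds) and compares it with a 32-bit CRC from Python's zlib module. The CRC is a stand-in for the sort of 32-bit check that FC frames carry, not FC's own implementation, and the sample data is made up.

```python
import zlib


def internet_checksum(data):
    """16-bit ones' complement checksum over the data, TCP-style."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold any carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"AABBCCDD"
corrupted = b"BBAACCDD"  # the first two 16-bit words have changed places

# Addition does not care about order, so the additive checksum is unchanged.
same_sum = internet_checksum(original) == internet_checksum(corrupted)

# A CRC depends on position, so the swap alters the result.
same_crc = zlib.crc32(original) == zlib.crc32(corrupted)
```

Here same_sum comes out True and same_crc comes out False: the additive checksum lets the reordered data through, while the CRC catches it. Errors like this are rare on any one packet, which is why the difference only starts to matter across very large transfers.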

Speed and processor power remain the biggest problems for iSCSI. The proponents of the standard implicitly accept that it is less efficient than FC by coupling it with 10 gigabit Ethernet -- a new and untested standard that will have to succeed if iSCSI is to happen in a big way. But TCP can consume between two and four MHz of CPU clock speed for each Mbps of network bandwidth, even without encryption, so at gigabit Ethernet and above it imposes unreasonable demands on servers. The cure for that is dedicated hardware, a TCP Offload Engine (TOE) that does all the network processing and delivers a finished message to server memory independent of the server's own processor. Like 10 gigabit Ethernet, this is cutting-edge technology that's not widely available or trusted yet.
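To put numbers on that rule of thumb, the short calculation below applies the two-to-four MHz per Mbps range quoted above to three link speeds. The link speeds are the nominal figures for each Ethernet generation, and the function name is simply a label chosen for this sketch.

```python
LOW_MHZ_PER_MBPS = 2   # lower bound quoted in the text
HIGH_MHZ_PER_MBPS = 4  # upper bound quoted in the text


def tcp_cpu_cost_mhz(link_mbps):
    """Return the (low, high) CPU clock speed in MHz consumed by TCP at full link rate."""
    return (link_mbps * LOW_MHZ_PER_MBPS, link_mbps * HIGH_MHZ_PER_MBPS)


links = {
    "Fast Ethernet": 100,           # nominal Mbps
    "Gigabit Ethernet": 1000,
    "10 gigabit Ethernet": 10000,
}

estimates = {name: tcp_cpu_cost_mhz(mbps) for name, mbps in links.items()}

for name, (low, high) in estimates.items():
    print(f"{name}: {low / 1000:g} to {high / 1000:g} GHz of CPU")
```

On these figures a saturated gigabit link would swallow 2GHz to 4GHz of processor before the server did any useful work, and a 10 gigabit link ten times that, which is the case for moving the work onto dedicated hardware.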

Only when all of the above have been out in the real world for long enough to establish themselves as real, reliable and interoperable systems will iSCSI be a serious competitor to Fibre Channel. Before that point, however, it may well be a useful complementary technology, linking islands of FC SAN across IP clouds or creating smaller, slower but more flexible SANs from existing Ethernet networks. Basing today's SANs on what today's technology can demonstrably do, without writing off the potential of what's under development, is the sanest approach.
