An iSCSI primer

Scott Lowe goes over some of the basics of an iSCSI storage implementation.

I've written a lot about iSCSI in these storage columns, and I believe it to have a very bright future, especially with 10-Gb Ethernet on the way and the significant difference in acquisition costs between a complete iSCSI installation and a similar fibre channel setup.

In general, storage technologies have tended to be complex and somewhat misunderstood. iSCSI, while simpler, still retains some of that complexity and adds some new concepts that you will need to get used to if you implement it. In this article, I'll provide some information about iSCSI for those of you who may be considering this quickly growing technology.

Some people consider iSCSI to be the melding of a dedicated storage area network (SAN), like fibre channel, with network attached storage (NAS). This is probably because iSCSI uses Ethernet as its transport mechanism. That said, on a scale with NAS on one side and SAN on the other, iSCSI is much closer to the SAN side. Whereas NAS devices work at the file level--that is, entire files are transferred to and from the device--iSCSI works at the block level, exactly like a locally connected disk. This means that iSCSI can be used in situations inappropriate for NAS, such as for some database applications and Exchange. (Although Exchange can use NAS file-level devices, block-level storage is the preferred mechanism.) When you attach a server to an iSCSI array, the storage looks to the server just like a local disk, thus providing a seamless storage experience.
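To make the block-level idea concrete, here is a minimal Python sketch. It is not iSCSI code: an ordinary temporary file stands in for a disk, and the 512-byte block size and the helper names write_block and read_block are assumptions made for illustration. The point is only that a block device is addressed by block number, so one block can be changed without touching anything around it.

```python
import tempfile

BLOCK_SIZE = 512  # assumed block size, for illustration only


def write_block(dev, block_number, data):
    """Overwrite exactly one block at the given block number."""
    if len(data) != BLOCK_SIZE:
        raise ValueError("data must be exactly one block long")
    dev.seek(block_number * BLOCK_SIZE)
    dev.write(data)


def read_block(dev, block_number):
    """Read exactly one block at the given block number."""
    dev.seek(block_number * BLOCK_SIZE)
    return dev.read(BLOCK_SIZE)


def demo():
    # A temporary file stands in for a disk of 8 zero-filled blocks.
    with tempfile.TemporaryFile() as dev:
        dev.write(bytes(BLOCK_SIZE * 8))
        # Block-level access: change only block 3 and leave the rest alone.
        write_block(dev, 3, b"x" * BLOCK_SIZE)
        return read_block(dev, 3), read_block(dev, 2)


if __name__ == "__main__":
    changed, untouched = demo()
    print(changed[:4], untouched[:4])
```

A database engine works in much the same way against its storage, updating individual pages in place, which is one reason block-level storage suits it well.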

Many iSCSI arrays offer the following features:

Multiple path capability: Like other storage technologies, iSCSI offers the capability for multiple data paths to provide redundancy and greater throughput. Instead of relying on a single gigabit Ethernet connection that is also a single point of failure, you can install multiple gigabit Ethernet adapters in your servers and provide a fairly inexpensive, fully meshed storage architecture that, when everything is up and running, also offers aggregated bandwidth to the iSCSI target for improved performance.
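The following Python sketch shows the general idea behind multipathing: requests rotate across every healthy path, and a failed path is simply skipped. It is a toy model, not the behavior of any particular initiator or multipath driver. The class name MultipathSelector and the path labels are hypothetical.

```python
from itertools import cycle


class MultipathSelector:
    """Toy round-robin path selector with failover (illustrative only)."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.healthy = set(self.paths)
        self._rotation = cycle(self.paths)

    def mark_failed(self, path):
        """Take a path out of rotation, for example after a link failure."""
        self.healthy.discard(path)

    def mark_restored(self, path):
        """Put a known path back into rotation."""
        if path in self.paths:
            self.healthy.add(path)

    def next_path(self):
        """Return the next healthy path, skipping failed ones."""
        if not self.healthy:
            raise RuntimeError("no healthy paths to the target")
        for _ in range(len(self.paths)):
            path = next(self._rotation)
            if path in self.healthy:
                return path
        raise RuntimeError("no healthy paths to the target")


if __name__ == "__main__":
    selector = MultipathSelector(["nic0", "nic1"])
    print([selector.next_path() for _ in range(4)])
    selector.mark_failed("nic0")
    print([selector.next_path() for _ in range(3)])
```

With both paths up, traffic alternates between them, which is where the aggregated bandwidth comes from. With one path down, everything flows over the survivor, which is where the redundancy comes from.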

Ethernet jumbo frame support: This isn't really an iSCSI technology, but it does make iSCSI perform better. Jumbo frames carry more data per frame than the standard 1,500-byte Ethernet payload, so the same amount of data moves in fewer frames, which reduces the overhead on both your servers and iSCSI targets. Jumbo frames are generally 9K in size, but some NICs and switches support 16K frames as well.
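A quick back-of-the-envelope calculation shows why larger frames help. The Python sketch below assumes an MTU of 1,500 bytes for standard frames and 9,000 bytes for jumbo frames, and 40 bytes of IPv4 and TCP headers per frame with no options. It ignores iSCSI's own headers and Ethernet framing, so treat the numbers as rough.

```python
import math

IP_TCP_HEADERS = 40  # bytes: 20 IPv4 + 20 TCP, assuming no options


def frames_needed(payload_bytes, mtu):
    """Number of frames needed to carry payload_bytes at a given MTU."""
    per_frame = mtu - IP_TCP_HEADERS
    return math.ceil(payload_bytes / per_frame)


def compare(payload_bytes=1_000_000):
    """Return (standard_frames, jumbo_frames) for the same payload."""
    standard = frames_needed(payload_bytes, 1500)
    jumbo = frames_needed(payload_bytes, 9000)
    return standard, jumbo


if __name__ == "__main__":
    standard, jumbo = compare()
    print(f"standard frames: {standard}, jumbo frames: {jumbo}")
```

Under these assumptions, moving 1,000,000 bytes takes 685 standard frames but only 112 jumbo frames, so the servers and the target handle roughly one-sixth as many frames for the same data.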

Snapshots: Generally available for an additional (and sometimes huge) cost on fibre channel gear, snapshots are the primary reason that many companies opt for a centralized storage architecture. In most iSCSI equipment, the snapshot feature is included in the base price of the product. Snapshots protect your data between backups. Without snapshot capability, many companies have a "window of risk" of 24 hours or more, meaning that any data written between backups can be lost. For example, if you have database corruption in your ERP system at 5 P.M. and you restore from the previous evening's backup, you could lose 17 hours of data, or more. Using snapshots, if your database blew up at 5 P.M., you could remount a snapshot from 4 P.M. and, while there would still be some data loss, it would be much more limited. Snapshots significantly reduce your window of risk.
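The window-of-risk arithmetic in the example above can be checked with a few lines of Python. The calendar date is arbitrary, and the midnight backup time is an assumption chosen to match the 17-hour figure. An earlier backup would make the loss larger.

```python
from datetime import datetime


def hours_lost(failure_time, last_recovery_point):
    """Hours of data written after the last recovery point and before the failure."""
    return (failure_time - last_recovery_point).total_seconds() / 3600


# The date is arbitrary; only the times of day matter here.
failure = datetime(2024, 1, 2, 17, 0)   # corruption at 5 P.M.
backup = datetime(2024, 1, 2, 0, 0)     # assumed: nightly backup finished at midnight
snapshot = datetime(2024, 1, 2, 16, 0)  # snapshot taken at 4 P.M.

if __name__ == "__main__":
    print("restore from backup, hours lost:", hours_lost(failure, backup))
    print("restore from snapshot, hours lost:", hours_lost(failure, snapshot))
```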

Replication: This feature is also available on fibre channel SANs--often at a hefty cost--but is usually bundled with iSCSI SANs. Replication is an important disaster recovery element that can automatically copy data from your primary data center to a similar SAN array in a backup data center. With iSCSI's comparatively inexpensive price tag, true hot-site disaster recovery becomes a real possibility for small- and medium-size businesses. Both synchronous and asynchronous replication are generally supported. Synchronous replication is considered "real-time" and is best suited to fast links, while asynchronous replication, which usually runs on a schedule, is more suitable for remote offices.
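The difference between the two replication modes can be sketched in Python as follows. This is a toy model under simplifying assumptions, not how any vendor's array implements replication. The class name ReplicatedVolume and its methods are hypothetical, and dictionaries stand in for the primary and backup arrays.

```python
class ReplicatedVolume:
    """Toy model of a primary volume with one replica (illustrative only)."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.primary = {}
        self.replica = {}
        self.pending = []  # writes not yet copied to the replica

    def write(self, block, data):
        """Write to the primary and replicate according to the mode."""
        self.primary[block] = data
        if self.synchronous:
            # Synchronous: the replica is updated before the write returns.
            self.replica[block] = data
        else:
            # Asynchronous: return now and copy the write later.
            self.pending.append((block, data))

    def run_scheduled_replication(self):
        """Copy queued writes to the replica, as a scheduled job would."""
        for block, data in self.pending:
            self.replica[block] = data
        self.pending.clear()


if __name__ == "__main__":
    remote = ReplicatedVolume(synchronous=False)
    remote.write(0, b"payroll")
    print("before schedule runs:", remote.replica)
    remote.run_scheduled_replication()
    print("after schedule runs:", remote.replica)
```

In the synchronous case every write waits on the link to the backup site, which is why it favors fast links. In the asynchronous case the replica lags behind the primary until the next scheduled run.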

Scalability: Enterprise-class iSCSI SAN arrays can scale in both storage and performance as you add units to the storage cluster. For example, in my data center, we have a single EqualLogic PS200E array with multiple gigabit Ethernet connections. When I eventually add a unit to the SAN cluster (even a single unit goes into a cluster), the two arrays will automatically restripe all data across both units, and the available bandwidth to the overall cluster will double since I will be doubling the number of gigabit Ethernet connections.

Ethernet: This is the simplest part of iSCSI. It runs on Ethernet. The chances are pretty good that you and your coworkers are intimately familiar with Ethernet--both with its good points and its bad points. With iSCSI, you don't need to learn yet another transport mechanism. You can rely on inexpensive, commodity hardware and can get your storage network running quickly and easily.

iSCSI is beginning to make inroads into companies that may not have considered this technology before. With its growing popularity, it's important for IT people to understand some of the features and benefits.