Data retention strategies for SMEs

Backup and archiving may not head the IT agendas of many small businesses, but for reasons of business continuity and regulatory compliance, the prudent SME will be formulating a data retention strategy. We examine the options.

Backup and archiving are activities that many small and medium-sized enterprises (SMEs) either don’t think about or are ill-equipped to handle. However, legislation regarding the storage of email and digital business records is changing, and this may result in many businesses facing unexpected penalties if they do not have a comprehensive data retention strategy. In companies lacking an IT department, or even an office manager and central servers, workers are often reluctant to organise their data so that it can be easily backed up, and are reluctant to spend any time actually performing backups. This can be due to lack of training and to data backup not being specified as a part of their job. Backup is often regarded as extra work that isn’t essential to everyday job performance.

Current backup solutions for SMEs are add-ons that are relatively difficult to use and difficult to integrate into an office workflow. They consist of backup software, installed either on individual PCs or on a server and purchased with enough licences to cover the number of users, plus backup hardware and its associated media. End users might find backup a lot easier if the required software and hardware were sold integrated into business PCs and their operating systems.

It’s particularly ironic that, today, the cheapest, fastest and most convenient way to back up the contents of a hard disk is to use another hard disk.

Backup system design considerations
The design considerations for backup systems are complex and demanding. They include media cost, media capacity, media longevity and durability, compatibility, security, re-usability and disposability, backup and restore speed, ease-of-use and ease of access to legacy data.

Longevity is a major problem with systems used for long-term archiving, not only in terms of media storage life, but perhaps most critically in the turnover time of the software and hardware used to create and read the backups. Having media with a storage life in excess of 100 years is pretty meaningless if, after only five years, the software and hardware required to read it are no longer in use. Clearly, longevity does not pose as big a problem for short-term backup.

Almost all backup systems provide security via passwords and data encryption, although this is usually an option and may not always be implemented by the user.


Backup technology comparisons

The cost per gigabyte (GB), maximum capacity and typical transfer rates for tape, optical (CD/DVD) and hard disk backup are compared in the three charts that follow. These charts are based on currently representative values. Prices vary from vendor to vendor and tend to fall for newer technologies, although they may eventually rise for technologies at the end of their life cycle. Capacities for hard disks, in particular, are likely to rise in the near future, so the cost per GB should fall. Transfer rates are useful for a comparison of different backup technologies, but in practice backup times will vary and backup rates are generally lower than the maximum specified transfer rates.
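As a rough illustration of the gap between a rated transfer rate and a real backup time, the following sketch estimates how long a backup might take. The 150MB/s rate and the 30GB data volume are figures quoted elsewhere in this article; the `efficiency` factor of 0.5 is purely an assumption for illustration, not a measured value, and `estimate_backup_seconds` is a hypothetical helper rather than part of any backup product.

```python
def estimate_backup_seconds(data_gb, rate_mb_per_s, efficiency=1.0):
    """Estimate backup time in seconds.

    data_gb        -- volume of data to back up, in GB (1GB taken as 1000MB)
    rate_mb_per_s  -- rated maximum transfer rate in MB/s
    efficiency     -- fraction of the rated speed achieved in practice
                      (an assumed value, not a measurement)
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    return (data_gb * 1000) / (rate_mb_per_s * efficiency)


# Best case: 30GB at the full 150MB/s rated speed of a SATA disk.
best_case = estimate_backup_seconds(30, 150)
# Illustrative case: the same backup at an assumed 50% of the rated speed.
half_speed = estimate_backup_seconds(30, 150, efficiency=0.5)

print(f"Best case: {best_case / 60:.1f} minutes")
print(f"At 50% efficiency: {half_speed / 60:.1f} minutes")
```

Halving the effective rate doubles the backup time, so a rated figure is best treated as a lower bound on how long a backup will take.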

Our analysis shows that, at present, bulk-bought DVD-Rs are the most cost-efficient method of storing data, although they do need a suitable storage container, which adds to the cost. DVD drives and recording software are also very competitively priced and most backup software for the SME market supports writing to DVD.

Although CD-R discs are cheap on an individual basis, their moderate storage capacity makes for a relatively high cost per GB stored.

High-capacity Digital Linear Tapes (DLT) provide a low cost per GB stored, but require a heavy up-front investment in expensive tape drives. Bizarrely, DLT 3 appears to be one of the most expensive ways to store data, while SDLT II is one of the cheapest.

Although, like magnetic tape, hard disks are based on mature technology, regular innovation means that hard disks continue to provide ever higher capacities and faster transfer rates at lower cost.


Bulk-bought DVD-R media currently offer the most cost-efficient method of storing data.



Super DLT II tape is currently the most capacious backup medium, but SATA hard disks are catching up fast.

At 150MB/s, SATA hard disks lead the field by some distance when it comes to data transfer rate.

Magnetic tape

Magnetic tape has been in use as a storage and backup medium for over fifty years. 3M introduced perhaps the earliest cartridge or cassette system, the QIC (Quarter Inch Cartridge) format tape cassette, back in 1972. Nevertheless, tape backup system manufacturers maintain that there's still life left in this venerable backup method. The DAT (Digital Audio Tape) Manufacturers Group roadmap, published in 2004, projected further development on DAT-based DDS (Digital Data Storage) tape formats into 2009. However, although tape drives and media for formats such as Exabyte, Travan and DDS/DAT can still be purchased, it does seem that this technology is in decline. Major manufacturing of analogue audio tape has ceased, and it may not be too long before the same can be said of all forms of recording tape.

On the other hand, the latest tape formats, Super Digital Linear Tape II for example, are managing to keep up with hard disk capacities. SDLT II offers up to 600GB compressed (300GB uncompressed) on a single cartridge, at a competitive cost per GB of data stored. For small businesses, the drawback of SDLT II and similar technologies is the high cost of the tape transports. An HP StorageWorks SDLT 600 Super DLT II tape drive, for example, currently costs approximately £2,500 (ex. VAT).
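To put the up-front cost in perspective, the sketch below spreads the drive price over the native capacity of the cartridges filled. It uses only figures quoted in this article (a drive at roughly £2,500, 300GB uncompressed per cartridge, and hard disk storage at about 15p per GB). The cost of the tape cartridges themselves is not included, and the function names are hypothetical.

```python
import math

DRIVE_COST_PENCE = 250_000   # approx. £2,500 ex. VAT, as quoted above
CARTRIDGE_NATIVE_GB = 300    # SDLT II uncompressed capacity
DISK_PENCE_PER_GB = 15       # hard disk figure quoted in this article


def drive_cost_per_gb_pence(cartridges_filled):
    """Drive purchase price spread over the native capacity written so far."""
    return DRIVE_COST_PENCE / (cartridges_filled * CARTRIDGE_NATIVE_GB)


def cartridges_to_match_disk():
    """Cartridges to fill before the drive's share alone falls to the disk cost per GB."""
    return math.ceil(DRIVE_COST_PENCE / (DISK_PENCE_PER_GB * CARTRIDGE_NATIVE_GB))


for n in (1, 10, 50):
    print(f"{n:>3} cartridges: {drive_cost_per_gb_pence(n):.1f}p per GB for the drive alone")

print("Cartridges to match disk:", cartridges_to_match_disk())
```

On these figures, 56 cartridges (about 16.8TB of uncompressed data) would have to be written before the drive's share of the cost drops to 15p per GB, and that is before any media is paid for.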

The major advantage of tape was its affordability, and in the early days of computing, apart from punched paper tape or cards, it was the only feasible backup medium. In its latest forms it offers high capacity and is easily scaled, and cassettes can easily be loaded in and out of drives, with some systems even using automatic loading and unloading. On the minus side, tape is fragile; it can jam in drives and is notorious for the recording layer falling off the plastic backing over time. Although the archive life of, for example, a Quantum DLT 8000 tape is specified as 30 years, this is under tightly controlled and closely specified environmental conditions. Tape is also relatively slow and, because it's a linear recording medium, inefficient to search. The huge number of physical cassette and recording formats makes compatibility and long-term readability a problem.


CD, DVD, HD-DVD and Blu-ray optical discs

Although this cannot be said of all optical disc formats, CD and DVD recordable formats have, so far, fared very well for compatibility and longevity. As new drive designs develop to accommodate new higher-density formats (Blu-ray and HD-DVD being the latest) they have retained compatibility with the very first CD formats launched in 1982. It’s quite impressive that you can take a 25-year-old CD, load it into the latest Blu-ray drive mounted on a PC running Windows Vista, and it will still play. This is not true of any other digital medium.

Recordable optical discs provide capacities ranging from 0.7GB to 50GB per disc (TDK even has 200GB Blu-ray media in prototype). With Blu-ray and HD-DVD technologies becoming available, data capacity per disc is approaching a reasonable fraction of current hard disk capacity. Media and drive costs are low and, with random access and reasonably high transfer rates, data retrieval is easy. Even in a jewel box or Amaray case, CDs and DVDs are small, lightweight and easy to store.

From a security viewpoint, the convenience and widespread availability of CD and DVD writers makes it extremely easy for individuals to make and read copies of sensitive data.


Backup to hard disk, or D2D

Until recently, perhaps the biggest drawback of backing up or archiving to hard disk (known to storage specialists as D2D, for 'Disk-to-Disk') has been the inconvenience of physically connecting and disconnecting drives. However, both the SATA drive interface and the ready availability of external housings with FireWire or USB 2.0 connections have solved this problem. Backups can easily be made to a temporarily connected external drive, or a SATA cage can be fitted to critical systems and servers so that drives can be rapidly hot-swapped in and out (see our review of the Axstor AX-DC SATA Hot Swap Drive Cages).

At about 15p per gigabyte, hard disk storage is very cost effective, and prices are likely to fall further as drive capacities continue to rise. Although hard disks are available in a range of capacities, the optimum cost per GB is achieved by buying drives with capacities that fall in the ‘sweet spot’. At present, in the first quarter of 2007, this is around 400GB.

For capacity, cost, convenience and backup/retrieval speed, hard disks are now hard to beat. The disadvantages are that they are bulky, heavy and inefficient for storing small volumes of data, while mechanically they are a little on the fragile side; long-term compatibility and longevity may also prove problematic.


Backup strategies and network design

Backup strategies should be based on the actual volume of data that needs to be backed up, rather than simply taking the maximum volume of available storage as a guide. Typically, office workers with PCs that have been in use for a couple of years are likely to be using no more than about 30GB of storage. Although individual workers' machines may be fitted with, for example, 80GB hard drives, in this case any backup solution only has to cope with just over a third of the total disk capacity. If sensible archiving strategies are used, then the volume of active data actually in use could perhaps be reduced by 50 per cent, down to 15GB per PC. This makes regular backups even of the entire volume of data stored on individual disks much more manageable.
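The sizing argument above can be sketched as a short calculation. The 80GB, 30GB and 50 per cent figures come from the text; the office size of four PCs is an assumption chosen only for illustration, and `office_backup_volume_gb` is a hypothetical helper.

```python
def office_backup_volume_gb(pcs, used_gb_per_pc, archive_reduction=0.0):
    """Total active data to back up across an office, in GB.

    archive_reduction is the fraction of data taken out of active use
    by archiving (0.5 means the active volume is halved).
    """
    return pcs * used_gb_per_pc * (1 - archive_reduction)


DISK_CAPACITY_GB = 80   # example drive size from the text
USED_GB = 30            # typical usage per PC from the text
PCS = 4                 # assumed office size, for illustration only

print(f"Share of each disk in use: {USED_GB / DISK_CAPACITY_GB:.1%}")
print("Before archiving:", office_backup_volume_gb(PCS, USED_GB), "GB")
print("After archiving: ", office_backup_volume_gb(PCS, USED_GB, 0.5), "GB")
```

For the assumed four-PC office, the volume to back up falls from 120GB to 60GB once archiving is applied, against 320GB of installed disk capacity.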

Small offices of three or four people often have PCs set up as a peer-to-peer network, sharing a common gateway router for internet access. Although peer-to-peer networks are easy to set up, they don’t make efficient use of common data, such as address lists and client databases. There is a break-even point where a central office server makes more efficient use of storage and is more convenient in providing a centralised mechanism for backup and archiving.

For small-office peer-to-peer setups, a single external large hard disk or an external DVD writer can be used for backups. For offices where the network includes a server, the server can be fitted with a DVD writer and/or a SATA hot-swap drive cage.

Off-site Internet backup
One option that avoids the cost and inconvenience of installing and maintaining backup hardware and software is subscription to an off-site internet backup service. Of course, this does require an internet connection, but most businesses will already have an internet service installed. There are quite a number of companies offering off-site backups, via a local software client running in the background that encrypts the subscriber data and sends it over the internet for remote storage. These services operate on a monthly fee basis, which is scaled according to the amount of data stored. Typical subscription charges are around £10 a month for 0.5GB or £35 a month for 10GB. With no capital outlay, no setup fee and little staff time needed for training or operation, an internet backup service can seem a simple and attractive solution to the backup problem.
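The typical charges quoted above can be compared on a per-GB basis with a short calculation. This is a sketch using only the two example tariffs from the text; the function names are hypothetical, and real services may price differently.

```python
# Typical monthly tariffs quoted above: (storage in GB, fee in GBP per month).
TARIFFS = [(0.5, 10.0), (10.0, 35.0)]


def monthly_cost_per_gb(storage_gb, monthly_fee):
    """Monthly subscription fee divided by the storage allowance."""
    return monthly_fee / storage_gb


def annual_cost(monthly_fee):
    """Total subscription cost over twelve months."""
    return monthly_fee * 12


for storage_gb, fee in TARIFFS:
    print(
        f"{storage_gb}GB plan: GBP {monthly_cost_per_gb(storage_gb, fee):.2f} "
        f"per GB per month, GBP {annual_cost(fee):.0f} per year"
    )
```

On these figures the smaller plan works out at £20 per GB per month and the larger at £3.50 per GB per month, a recurring charge that is worth weighing against the one-off cost of local media.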

The advantage of these remote backup services is that, by storing data off-site, they provide a measure of protection against local threats such as fire, flood or burglary. However, this may also be seen as a disadvantage in that subscribers' data is no longer under their direct control. Customers must rely on the internet backup company to keep their data secure and always available. And what happens to your data if, for example, the internet backup company goes out of business?


Data, archiving and the law

In the past, businesses and legal systems dealt with data retention matters based on the assumption that all relevant business records would be on paper. Increasingly this is no longer the case, and many business transactions are now largely, or even completely, carried out electronically. The Data Protection Act (DPA) of 1998 gives individuals the right, on producing evidence of their identity, to have a copy of personal data held about them — including information contained in emails of a personal and biographical nature.

The DPA also requires organisations to take appropriate technical and organisational measures to prevent unauthorised or unlawful processing of personal data and to protect against accidental loss or destruction of personal data. Archived data can be particularly vulnerable to all of these hazards.

Email can often be regarded as transitory, and many email clients have a capacity limit and aren’t designed to archive email indefinitely. However, there are a number of situations where companies are apparently legally required to retain email for up to six years. Small companies using external email services such as AOL may find themselves particularly exposed.

In reality it may prove to be impractical, or at least extremely difficult, for small companies to comply with future and even current legislation regarding data retention. Nevertheless, if they have not already done so, SMEs would be prudent to review their backup and archiving strategies as soon as possible.