Is disaster recovery testing putting your company at risk?

Commentary: Bus-Tech's Jim O'Connor says businesses must have a way to ensure continuous data replication, even during disaster recovery testing.
Written by Jim O'Connor, Bus-Tech, Contributor
To quell growing concerns over data theft, many companies have switched from physical tape backup to disk-based solutions and data vaulting, in which data is transmitted to a disaster recovery site over a public or private network.

While this alleviates the obvious worries over physical backup tapes, it raises some considerations for disaster recovery (DR) testing. In a peer-to-peer implementation, some businesses break the data replication process while conducting their DR tests, which puts them at risk of lengthy data recovery delays and possible non-compliance with regulatory guidelines. With appropriate techniques in an electronic data vaulting implementation, replication can continue uninterrupted during DR testing, eliminating the exposure of having the production site and DR site out of sync.

Compliance legislation appears set to remain a serious issue for companies for the foreseeable future. Regulations such as Sarbanes-Oxley, Government Securities Act Regulation 17, FDA 21 CFR Part 11 and HIPAA increase compliance exposure for companies and demand more aggressive measures to lower risk. In the case of legal discovery, where companies may be required to submit company data as evidence, it is imperative that this information be factual, up-to-date, and compliant. If data is unavailable for several hours or days during DR testing, a company will not be able to produce the requested information in the time required, putting it at further risk of non-compliance.

Why physical tape storage is no longer a practical solution
Businesses today are realizing the perils of merely storing data tapes offsite as part of their disaster recovery plan. There are simply too many opportunities for tapes to become lost, damaged, or stolen in transit or at the offsite destination. In addition, data recovery from tape is far too slow. Typically, records storage vendors guarantee they will be able to retrieve a tape within 24 hours of having picked it up at the customer site. And that doesn't include data restoration time. If a disaster occurs, IT managers must spend several days locating disparate data sets across tape volumes and restoring tapes in the proper sequential order. This is an extremely laborious process that typically exceeds time-to-recovery compliance requirements. To avoid these problems, many companies have moved to peer-to-peer data vaulting for their replication and DR needs.

The advantages and best practices for peer-to-peer data vaulting
In peer-to-peer data vaulting, data is replicated to a remote disk subsystem via a standard open-systems communication link. Data from a company's main site is automatically replicated to an off-site location on a schedule, whether daily or as needed. Keeping company data at two separate sites allows businesses to eliminate the need for tape, and it cuts the restore window from days to hours. However, peer-to-peer replication can present some issues. For example, if a business breaks the replication link between the two sites when it begins disaster recovery testing, it effectively takes the remote replication process offline completely. The link stays broken for the length of the DR test, which can be days, and once it is re-established it can take some time for the backup site to catch up with the primary site and bring the data back in sync. This leaves the business vulnerable if a disaster occurs during that time, or if a compliance event requires immediate access to company data.
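To make the exposure window concrete, the short Python sketch below estimates how long the DR site stays out of sync when the link is broken for a test. It is only a back-of-the-envelope model: the function name and all of the figures (a 48-hour test, 20 GB of changed data per hour, a link that moves 100 GB per hour) are hypothetical assumptions, not numbers from any particular installation. It assumes production keeps generating changes while the backlog drains, so only the spare link capacity goes toward catching up.

```python
def catch_up_hours(outage_hours, change_rate_gb_per_hour, link_gb_per_hour):
    """Estimate hours needed to resynchronize after a replication outage.

    Simplified model (an assumption, not a vendor formula): the backlog is
    the data changed during the outage, and it drains at the link rate
    minus the rate at which new changes keep arriving.
    """
    if link_gb_per_hour <= change_rate_gb_per_hour:
        raise ValueError("link cannot outpace new changes; backlog never drains")
    backlog_gb = outage_hours * change_rate_gb_per_hour
    return backlog_gb / (link_gb_per_hour - change_rate_gb_per_hour)


# Hypothetical figures for illustration only.
test_hours = 48
resync_hours = catch_up_hours(test_hours, change_rate_gb_per_hour=20,
                              link_gb_per_hour=100)
exposure_hours = test_hours + resync_hours
print(f"Resync takes {resync_hours:.1f} h; total exposure {exposure_hours:.1f} h")
```

Under these assumed figures the site is exposed for longer than the test itself, because the catch-up period is added on top of the time the link was down.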

A safer method of mainframe data recovery testing
A more prudent method involves taking a "snapshot," or point-in-time copy, of the recovery data and conducting disaster recovery testing from that image. This approach leverages traditional open-systems disk technology: it vaults the daily backup data via standard IP protocols to a remote location (or several locations) and then captures a snapshot representing a point-in-time copy of the recovery data. Once the snapshot is created, companies can run recovery tests against the copy while regular daily data replication continues uninterrupted. With this approach, the organization never has to break the link between the production and disaster recovery sites. If a system goes down, recovery can occur in just hours instead of days.
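The idea can be illustrated with a small in-memory Python sketch. Everything here is a hypothetical stand-in: `DrVolume`, `apply_replicated_write`, and `run_dr_test` are invented names, and a real disk subsystem would typically create snapshots with copy-on-write rather than a full copy. The point is only that the test works on a separate image, so replication into the live DR copy never has to stop.

```python
import copy


class DrVolume:
    """Toy stand-in for the replicated disk volume at the DR site."""

    def __init__(self):
        self.records = {}

    def apply_replicated_write(self, key, value):
        # Called whenever the production site ships a change.
        self.records[key] = value

    def snapshot(self):
        # Point-in-time copy; a deep copy keeps the sketch simple.
        return copy.deepcopy(self.records)


def run_dr_test(image):
    # Stand-in for a recovery test; it may freely modify its own image.
    image["test-marker"] = "restored"
    return len(image)


dr = DrVolume()
dr.apply_replicated_write("policy-001", "v1")

image = dr.snapshot()                           # freeze a copy for testing
dr.apply_replicated_write("policy-002", "v1")   # replication keeps running
run_dr_test(image)                              # test touches only the image

print(sorted(dr.records))   # live DR copy: both policies, no test marker
print(sorted(image))        # test image: first policy plus the test marker
```

The live DR copy keeps receiving production changes throughout, while anything the test writes stays confined to the snapshot image.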

This data vaulting approach has another side benefit. Many businesses are subject to government rules mandating the siting of an additional remote recovery site in case a regional disaster wipes out the local production and recovery sites. Brokerage firms, for example, need to abide by the Government Securities Act Regulation 17, which requires that company data be stored at a third-party location and that the firm also have a secondary recovery site at least 200 miles away. IP-based data vaulting, as described above, makes transmission to multiple DR sites easy to accomplish without the need for proprietary hardware and communications protocols. This capability, together with the ability to do DR testing without interrupting remote data transmission, makes virtual tape and disk-based peer-to-peer replication the safest disaster recovery solution for financial institutions, as well as for most other industries.

A real-world example
Many companies have learned firsthand the advantages of this approach. Take, for example, an insurance company that relies on IBM mainframe tapes for its data backup and disaster recovery and moves data between two sites. Typically, the nightly backup of records and system files can take upwards of four hours to complete. In addition, transporting and storing tapes at a local facility for disaster recovery purposes is not only costly but also requires a lot of manual coordination and effort. To make matters worse, the four hours or more required for tape backup can have a domino effect on overall system availability.

It's also important to consider that companies with nightly batch processing cannot start up the next day until all backups are completed. A delay in starting the batch run means that batch processing can still be under way when employees arrive at work the next day. Employees then have to wait for the batch processing to finish before they can access their applications.

To alleviate these challenges, a company can switch to a totally disk-based solution in which mainframe data is replicated from its main data center to one or more disaster recovery sites using a standard IP-based protocol. All tape-based I/O and processes continue to function exactly as before, even though no tapes are involved, and all data is now replicated via IP protocols to high-performance, error-protected disk storage.

Although many industry analysts have produced research reports stating the tangible cost of downtime per hour for particular industry sectors, the intangible costs associated with customer perception of lost data records or consumer-facing service outages are arguably far higher. Such scenarios may create perception problems among consumers that could damage a company's reputation for many months or more.

Physical tape backup is no longer a viable option in an era of alarming security and compliance breaches. Peer-to-peer data vaulting is an improvement, yet businesses must have a way to ensure continuous data replication, even during disaster recovery testing. Taking a point-in-time snapshot of mainframe systems data alleviates these security and compliance exposures, speeds up recovery time, and gives organizations a powerful means to avoid system interruptions and potentially costly downtime.

Jim O'Connor is vice president of Product Marketing at Bus-Tech.

Security checklist for ensuring information privacy

Network security: Eliminate guessable passwords and require that employees renew passwords on a regular basis.

Dynamic security: Enact measures to ensure that access controls can be revoked immediately for any terminated employees.

Facility protection: Restrict access to buildings via security guards, surveillance cameras, and/or security card readers.

Asset tracking: Invest in anti-theft locks for laptop computers throughout the office so no one walks off with valuable information. These inexpensive locks can save thousands in lost equipment and confidential data.

Encryption: Secure electronic records from prying eyes.

Audit trails: Keep track of who accessed, edited, or printed electronic files.

Virtual tape: Move files to a secure location without the inherent risks of using physical tapes that can be lost or stolen.
