Beyond having checklists and runbooks, what else can you do to test your backups?
In the article “How to Prove Your IBM i is Recoverable without a Real DR Test,” Tom Huntington from HelpSystems details the IBM i save commands and system objects that belong in a comprehensive backup. By auditing your backup process, you might identify missing components that would prevent you from completing the worst-case scenario: a full system restore. It is a great article that IBM i administrators should review.
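As a rough illustration of what such an audit could look like, here is a minimal Python sketch that compares the save commands your backup job actually ran against a list of required ones. The required list is an assumption made for illustration, loosely based on the commands a full-system save typically involves; the article itself remains the authoritative source. The function name `missing_components` and the sample job are hypothetical.

```python
# Hypothetical audit sketch. REQUIRED_SAVES is an illustrative assumption,
# not the list from the referenced article.
REQUIRED_SAVES = {
    "SAVSYS": "Licensed Internal Code, operating system, security and configuration data",
    "SAVLIB": "user and product libraries (for example LIB(*NONSYS))",
    "SAVDLO": "documents and folders",
    "SAV": "integrated file system objects",
}


def missing_components(commands_run):
    """Return the required save commands that never appear in commands_run."""
    # Only the command name (the first word of each line) is compared.
    seen = {line.split()[0].upper() for line in commands_run if line.strip()}
    return sorted(cmd for cmd in REQUIRED_SAVES if cmd not in seen)


if __name__ == "__main__":
    # Hypothetical nightly job that saves user libraries and one IFS directory.
    nightly = [
        "SAVLIB LIB(*ALLUSR) DEV(TAP01)",
        "SAV DEV('/QSYS.LIB/TAP01.DEVD') OBJ(('/home'))",
    ]
    for cmd in missing_components(nightly):
        print(f"Missing {cmd}: {REQUIRED_SAVES[cmd]}")
```

A check like this only confirms that a command appears in the job. It says nothing about whether the parameters cover everything or whether the save completed without errors.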
The article’s title includes “…without a Real DR Test,” and the reason is clear: performing an actual DR test often requires lengthy logistics, planning, and coordination across multiple groups in your organization. An actual DR test also carries the expense of activating the DR site, which might mean additional costs for the staff and assets needed to manage the event. And if your backup restoration fails, then what? You start over, partially or completely, and maybe even reschedule. Of course, DR rehearsals and tests are mandatory for any organization, but what other options are there for practicing a restore?
Why not practice the full system restore on a spare LPAR that you already have? If you have the resources, you could configure a “backup practice” LPAR of a similar size to the system of record and practice restoring to it. Some organizations have the infrastructure resources to do this, and some do not. If your LPAR is large or perhaps runs a canned ISV solution, you might not have the capacity to duplicate the system of record. In that case, it can seem that the only way to practice a restore is to actually activate your DR plan.
There is another way: the cloud. Several cloud vendors now offer the ability to spin up IBM Power-based LPARs running either IBM i or AIX. Using the cloud puts you in a useful middle ground. You aren’t incurring the full-blown expense of activating the DR plan, but you can test a full restore “at your leisure.” Yes, burning resources in the cloud costs money, but probably not as much as the total cost of a full DR test. In the cloud, you can also run multiple iterations of a backup/restore practice process that needs improving. Some cloud vendors that support IBM i or AIX charge less while servers are not “powered on.” You’ll still be consuming disk storage for the OS and data, but the main compute/memory costs are greatly reduced or eliminated when the LPAR is powered off. Under this model, you might be able to practice a full system restore over the course of a week or more instead of rushing to complete everything quickly at a traditional DR recovery site.
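To make the cost reasoning concrete, here is a small Python sketch of the “pay for storage all week, pay for compute only while powered on” model. Every rate below is a made-up placeholder, not a quote from any vendor, and the sketch assumes compute/memory charges drop to zero while the LPAR is powered off, which may not match your provider’s pricing.

```python
def restore_practice_cost(hours_on, hours_off, compute_rate, storage_rate):
    """Estimate the cost of a practice LPAR.

    Storage is billed for every hour the LPAR exists; compute/memory is
    billed only while the LPAR is powered on (an assumption of this sketch).
    """
    total_hours = hours_on + hours_off
    return hours_on * compute_rate + total_hours * storage_rate


if __name__ == "__main__":
    # Hypothetical hourly rates, in whatever currency you are billed in.
    COMPUTE_RATE = 3.00
    STORAGE_RATE = 0.25
    WEEK_HOURS = 7 * 24

    # Powered on for 20 hours of restore practice spread across the week.
    part_time = restore_practice_cost(20, WEEK_HOURS - 20, COMPUTE_RATE, STORAGE_RATE)
    # Left powered on for the entire week.
    always_on = restore_practice_cost(WEEK_HOURS, 0, COMPUTE_RATE, STORAGE_RATE)

    print(f"Powered on 20 hours: {part_time:.2f}")
    print(f"Powered on all week: {always_on:.2f}")
```

Plugging in your vendor’s actual rates and your own estimate of hands-on hours gives you a figure to compare against the cost of activating the DR site.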
Lastly, pitch the idea of using the cloud to test some of your IBM i data restoration processes as a way to take the pressure off you, the system administrator. If the answer is “no,” oh well, at least you asked. If the answer is “yes,” then take advantage of one of the latest cool things in cloud computing. Reduce your stress, improve your backup/restore documentation, policies, and procedures, and get some cloud experience, all by testing your ability to restore your IBM i backup in the cloud.