DOCUMENTATION: Catalog recovery procedure: an example of disaster recovery.
Details: Sample Catalog recovery procedure
This is only an example of the procedures used. Not all steps are required, and some may not apply to every situation.
These directions assume that NetBackup has already been installed on the DR (disaster recovery) server.
Note: these instructions describe a full recovery of the master for redeployment into production.
1. Load media into DR site robot.
For safety, only load the catalog tape initially.
Alternatively, load all media write-protected.
2. Validate the bp.conf and vm.conf (if applicable) configuration files.
Additional things to check:
- Check /usr/openv/volmgr/vm.conf for a MEDIA_ID_BARCODE_CHARS entry, if it is needed.
- Check /usr/openv/netbackup/db/config for the touch files NUMBER_DATA_BUFFERS and SIZE_DATA_BUFFERS.
- Check /usr/openv/netbackup for the touch file NET_BUFFER_SZ.
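The checks in step 2 can be scripted. The following is a minimal sketch, not part of the original procedure: the function name check_nbu_tuning and its optional root-prefix argument are hypothetical, and it only reports what it finds using standard shell tools.

```shell
#!/bin/sh
# Hypothetical helper: report the step 2 settings under an optional root prefix.
# The prefix argument exists only so the function can be tried against a copy
# of the configuration tree; leave it empty on a real master server.
check_nbu_tuning() {
    root="${1:-}"

    # MEDIA_ID_BARCODE_CHARS entry in vm.conf (only needed at some sites).
    if grep -s '^MEDIA_ID_BARCODE_CHARS' "$root/usr/openv/volmgr/vm.conf"; then
        :
    else
        echo "vm.conf: no MEDIA_ID_BARCODE_CHARS entry"
    fi

    # Touch files that hold buffer tuning values.
    for f in \
        "$root/usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS" \
        "$root/usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS" \
        "$root/usr/openv/netbackup/NET_BUFFER_SZ"
    do
        if [ -f "$f" ]; then
            echo "$f: $(cat "$f")"
        else
            echo "$f: not present"
        fi
    done
}
```

Calling check_nbu_tuning with no argument inspects the live paths. Compare the reported values against those recorded for the production master.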
3. In the GUI, run the device configuration wizard. Uncheck all servers except the master server.
Run the wizard to configure the devices.
On the first result dialog, check for any limitations.
To change from the default drive densities, if desired:
In the Drag and Drop Configuration dialog, select the drive and click the Properties button. Verify the Drive Density.
Repeat this for each drive.
In the Configure Storage Unit dialog, select the storage unit and click Properties to verify the storage unit settings.
4. Run an inventory of the robot.
Preview the inventory and verify the media type is correct.
Make note of the barcodes and any changes from the production site, as different robots can return different barcodes for the same tape. For instance, LTO tapes can have "L#" on the end of the barcode. This can be disabled via the robot console option for short labels.
Update the volume configuration.
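To spot barcode differences caused by the "L#" tag, two barcode lists can be compared after stripping that suffix. This is a rough sketch under stated assumptions: the helper names and the one-barcode-per-line input files are hypothetical, and the pattern assumes the tag is a trailing "L" followed by a single digit. A barcode whose real ID happens to end that way would be shortened incorrectly.

```shell
#!/bin/sh
# Strip a trailing "L#" media-type tag (for example "L4") from each barcode
# read on standard input, one barcode per line.
strip_lto_suffix() {
    sed 's/L[0-9]$//'
}

# Hypothetical helper: print barcodes from the production list ($1) whose
# short form does not appear in the DR robot list ($2).
missing_at_dr() {
    prod_sorted=$(mktemp) && dr_sorted=$(mktemp) || return 1
    strip_lto_suffix < "$1" | sort -u > "$prod_sorted"
    strip_lto_suffix < "$2" | sort -u > "$dr_sorted"
    # comm -23 keeps lines that are only in the first (production) file.
    comm -23 "$prod_sorted" "$dr_sorted"
    rm -f "$prod_sorted" "$dr_sorted"
}
```

An empty result from missing_at_dr means every production barcode has a counterpart in the DR inventory once the tag is ignored.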
5. Add the following line to bp.conf:
RESOURCE_MONITOR_INTERVAL = 3600
This will change media server polling from 10 minutes to 1 hour.
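If the entry is added by script rather than by hand, it helps to avoid duplicate lines when the step is repeated. A minimal sketch, assuming a hypothetical helper named add_conf_entry and a bp.conf that ends with a newline:

```shell
#!/bin/sh
# Hypothetical helper: add "KEY = VALUE" to a bp.conf-style file only when
# no line for KEY exists yet. An existing entry is left untouched.
add_conf_entry() {
    file="$1" key="$2" value="$3"
    if grep -q "^${key}[[:space:]]*=" "$file"; then
        echo "$key already set in $file:"
        grep "^${key}[[:space:]]*=" "$file"
    else
        # Keep a copy of the file as it was before the change.
        cp -p "$file" "$file.bak" || return 1
        echo "$key = $value" >> "$file"
    fi
}

# Example (as root on the master server):
# add_conf_entry /usr/openv/netbackup/bp.conf RESOURCE_MONITOR_INTERVAL 3600
```

The helper never rewrites an existing value. If a different interval is already present, it prints that line so the value can be reviewed manually.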
6. Make copies of the DR environment bp.conf and vm.conf files.
# cd /usr/openv/netbackup
# cp bp.conf bp.conf.dr
# cd ../volmgr
# cp vm.conf vm.conf.dr
7. Recover the entire catalog.
This is performed from the GUI on the master server. Always log in to the Java GUI using the short hostname, as the fully qualified domain name may not match the production site.
Note: Optionally, the catalog recovery can be performed from the command line:
# /usr/openv/netbackup/bin/admincmd/bprecover -wizard
8. Manually deactivate all backup policies.
From the GUI, select all policies, right click and select Deactivate. This may take a while.
Be sure that all policies are deactivated before proceeding.
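Policies can also be deactivated from the command line instead of the GUI. The sketch below is an assumption-laden alternative, not part of the original procedure: it assumes that bppllist with no arguments prints one policy name per line and that bpplinfo accepts -modify -inactive, both of which should be checked against the command reference for the installed release. The helper name gen_deactivate_cmds is hypothetical, and it only prints commands for review.

```shell
#!/bin/sh
ADMINCMD=/usr/openv/netbackup/bin/admincmd

# Read policy names on standard input (one per line) and print one
# "bpplinfo ... -modify -inactive" command per policy for review.
# Nothing is executed here.
gen_deactivate_cmds() {
    while IFS= read -r policy; do
        [ -n "$policy" ] || continue
        echo "$ADMINCMD/bpplinfo $policy -modify -inactive"
    done
}

# Example (as root on the master server):
# $ADMINCMD/bppllist | gen_deactivate_cmds > /tmp/deactivate_policies
# sh /tmp/deactivate_policies
```

Writing the commands to a file first allows the list to be inspected before anything is changed.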
9. Shut down NetBackup.
# /usr/openv/netbackup/bin/bp.kill_all
Verify with bpps -x that only /opt/VRTSpbx/bin/pbx_exchange is running.
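The shutdown check can be automated by filtering the process listing. This is a sketch only: leftover_nbu_procs is a hypothetical name, and the filter assumes the bpps -x layout shown in step 17, where each process line has a numeric PID in the second column and section headers do not.

```shell
#!/bin/sh
# Read a process listing (for example the output of "bpps -x") on standard
# input and print every process line other than pbx_exchange.
# Returns 0 when nothing else is left, 1 otherwise.
leftover_nbu_procs() {
    awk '
        # A process line has a numeric PID in column 2; headers do not.
        $2 ~ /^[0-9]+$/ && $0 !~ /\/opt\/VRTSpbx\/bin\/pbx_exchange/ {
            print
            found = 1
        }
        END { exit (found ? 1 : 0) }
    '
}

# Example (as root on the master server):
# /usr/openv/netbackup/bin/bpps -x | leftover_nbu_procs && echo "clean shutdown"
```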
10. Prep the bp.conf and vm.conf configuration files.
Copy bp.conf and vm.conf to bp.conf.prod and vm.conf.prod:
# cd /usr/openv/netbackup
# cp bp.conf bp.conf.prod
# cd ../volmgr
# cp vm.conf vm.conf.prod
Then, copy bp.conf.dr and vm.conf.dr back to bp.conf and vm.conf:
# cd /usr/openv/netbackup
# cp bp.conf.dr bp.conf
# cd ../volmgr
# cp vm.conf.dr vm.conf
Verify that the hostnames of any remote Windows console servers are included in bp.conf with a SERVER entry.
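The SERVER entry check can be done with a short script. A minimal sketch, assuming a hypothetical helper check_server_entries and entries written in the "SERVER = hostname" form; the comparison is exact and case-sensitive.

```shell
#!/bin/sh
# Hypothetical helper: for each hostname given after the file name, report
# whether the bp.conf-style file has a matching "SERVER = host" line.
# Returns 1 if any host is missing.
check_server_entries() {
    file="$1"; shift
    rc=0
    for host in "$@"; do
        # Split each line on "=", require the key to be exactly SERVER,
        # and compare the trimmed value with the wanted hostname.
        if awk -F= -v h="$host" '
            $1 ~ /^SERVER[ \t]*$/ {
                v = $2
                gsub(/^[ \t]+|[ \t]+$/, "", v)
                if (v == h) found = 1
            }
            END { exit (found ? 0 : 1) }
        ' "$file"; then
            echo "ok: $host"
        else
            echo "missing: $host"
            rc=1
        fi
    done
    return $rc
}

# Example (hostname is a placeholder):
# check_server_entries /usr/openv/netbackup/bp.conf winconsole1
```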
11. Make sure bp.conf and vm.conf are configured correctly to reflect the DR environment.
Append a FORCE_RESTORE_MEDIA_SERVER entry to bp.conf for each media server that was used for backups in production but is not present at the DR site. The syntax of these entries is as follows:
FORCE_RESTORE_MEDIA_SERVER = 
12. Perform a partial startup of nbemm.
This will allow modification of the nbemm database without the job manager running and kicking off jobs.
# /usr/openv/netbackup/bin/nbdbms_start_stop start
# /usr/openv/netbackup/bin/nbemm
Run bpps -x to verify that nbemm and NB_dbsrv are running.
13. Deactivate all media servers not participating in the DR.
# /usr/openv/netbackup/bin/admincmd/nbemmcmd -updatehost -machinename -machinestateop set_admin_pause -machinetype media -masterserver 
Be sure to execute this command against every unavailable media server.
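When many media servers are unavailable, the per-host commands can be generated from a list, in the same style as the storage unit deletion in step 15. This is a sketch under an explicit assumption: the values after -machinename and -masterserver are missing from the command as printed in step 13, and the sketch assumes they are the media server name and the master server name. The helper name gen_admin_pause_cmds is hypothetical. Commands are printed for review, not executed.

```shell
#!/bin/sh
NBEMMCMD=/usr/openv/netbackup/bin/admincmd/nbemmcmd

# Read unavailable media server names on standard input (one per line) and
# print one nbemmcmd command per host. $1 is the master server name.
# ASSUMPTION: the arguments to -machinename and -masterserver are these
# two host names; the source text does not show them.
gen_admin_pause_cmds() {
    master="$1"
    while IFS= read -r media; do
        [ -n "$media" ] || continue
        echo "$NBEMMCMD -updatehost -machinename $media -machinestateop set_admin_pause -machinetype media -masterserver $master"
    done
}

# Example (file and host names are placeholders):
# gen_admin_pause_cmds master1 < /tmp/unavailable_media > /tmp/pause_media
# sh /tmp/pause_media
```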
14. Start nbevtmgr and bpdbm.
# /usr/openv/netbackup/bin/nbevtmgr
# /usr/openv/netbackup/bin/initbpdbm
Again, run bpps -x to verify that bpdbm and nbevtmgr are running.
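The "verify with bpps -x" checks in steps 12 and 14 can be scripted as well. A minimal sketch, assuming a hypothetical helper require_procs and the listing format shown in step 17. It matches only path-qualified commands, so a daemon listed without a path (such as vmd in that listing) would need a different pattern.

```shell
#!/bin/sh
# Read a process listing (for example "bpps -x" output) on standard input and
# check that every daemon named as an argument appears as the last component
# of a path in it. Prints the missing names; returns 1 if any are missing.
require_procs() {
    listing=$(cat)
    rc=0
    for name in "$@"; do
        # Require "/name" followed by whitespace, a quote, or end of line so
        # that "nbproxy dblib nbpem" does not count as nbpem itself.
        if ! printf '%s\n' "$listing" | grep -Eq "/${name}([[:space:]\"]|\$)"; then
            echo "not running: $name"
            rc=1
        fi
    done
    return $rc
}

# Example (as root on the master server):
# /usr/openv/netbackup/bin/bpps -x | require_procs nbemm NB_dbsrv bpdbm nbevtmgr
```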
15. Delete all storage units.
This step is optional, but will give a cleaner experience. Either use the GUI or the command line.
From the command line:
# /usr/openv/netbackup/bin/admincmd/bpstulist -go | cut -f 1 -d ' ' > /tmp/stu_groups
# /usr/openv/netbackup/bin/admincmd/bpstulist | cut -f 1 -d ' ' > /tmp/stu_list
# for i in `cat /tmp/stu_groups` ; do echo "/usr/openv/netbackup/bin/admincmd/bpstudel -group $i" ; done >> /tmp/delete_stu_groups
# for i in `cat /tmp/stu_list` ; do echo "/usr/openv/netbackup/bin/admincmd/bpstudel -label $i" ; done >> /tmp/delete_stus
# sh /tmp/delete_stu_groups
# sh /tmp/delete_stus
Note: Be sure to delete storage unit groups before deleting storage units!
16. Delete all tape devices from the command line.
# /usr/openv/netbackup/bin/admincmd/nbemmcmd -deletealldevices -allrecords
Verify no devices are returned:
# /usr/openv/volmgr/bin/tpconfig -emm_dev_list -noverbose
17. Stop and restart NetBackup.
# /usr/openv/netbackup/bin/bp.kill_all
Use bpps -x to verify that only /opt/VRTSpbx/bin/pbx_exchange is running.
# /usr/openv/netbackup/bin/bp.start_all
...
# bpps -x
NB Processes
------------
root 13809 13808 0 15:05:07 ? 0:00 /usr/openv/netbackup/bin/nbproxy dblib nbjm
root 13775 1 0 15:05:03 ? 0:00 /usr/openv/netbackup/bin/bpcompatd
root 13786 1 1 15:05:04 ? 0:00 /usr/openv/netbackup/bin/bpdbm
root 13757 1 0 15:05:01 ? 0:00 /usr/openv/netbackup/bin/nbrb
root 13747 1 0 15:04:59 ? 0:00 /usr/openv/netbackup/bin/nbevtmgr
root 13795 13786 0 15:05:05 ? 0:00 /usr/openv/netbackup/bin/bpjobd
root 13856 1 0 15:05:12 ? 0:00 /usr/openv/netbackup/bin/nbsvcmon
root 13813 1 1 15:05:08 ? 0:01 /usr/openv/netbackup/bin/nbstserv
root 13811 13810 1 15:05:07 ? 0:01 /usr/openv/netbackup/bin/nbproxy dblib nbpem
root 13818 1 1 15:05:09 ? 0:01 /usr/openv/netbackup/bin/nbrmms
root 13752 1 2 15:05:00 ? 0:02 /usr/openv/netbackup/bin/nbemm
root 13808 13797 0 15:05:07 ? 0:00 sh -c "/usr/openv/netbackup/bin/nbproxy" dblib nbjm
root 13770 1 1 15:05:02 ? 0:01 /usr/openv/netbackup/bin/bprd
root 13844 1 0 15:05:11 ? 0:00 /usr/openv/netbackup/bin/nbsl
root 13797 1 0 15:05:05 ? 0:00 /usr/openv/netbackup/bin/nbjm
root 13810 13804 0 15:05:07 ? 0:00 sh -c "/usr/openv/netbackup/bin/nbproxy" dblib nbpem
root 13804 1 0 15:05:06 ? 0:00 /usr/openv/netbackup/bin/nbpem
root 13742 1 0 15:04:57 ? 0:02 /usr/openv/db/bin/NB_dbsrv
MM Processes
------------
root 13783 1 1 15:05:04 ? 0:01 vmd -v
Shared Symantec Processes
-------------------------
root 142 1 1 16:46:54 ? 0:52 /opt/VRTSpbx/bin/pbx_exchange
18. In the GUI, run the device configuration wizard to configure shared drives. Uncheck all servers except the robot control host.
Run the wizard to configure the devices.
On the first result dialog, check for any limitations.
To change from the default drive densities, if desired:
In the Drag and Drop Configuration dialog, select the drive and click the Properties button. Verify the Drive Density.
Repeat for each drive.
In the Configure Storage Unit dialog, select the storage unit and click Properties to verify the storage unit settings.
If needed, repeat the process for any additional servers.
Using the Device Monitor, make sure that no drives have the RESTART bit set. Restart ltid on the servers if needed.
19. Run the robot inventory.
Before running the inventory, use the GUI to verify all the recovery media are set to non-robotic.
If they are not, select all robotic media, right click and select Move. Make sure "Volume is in a robotic library" is unchecked.
Note: The volume group may be "---" - this is okay.
Hit OK.
Make note of the barcodes and any changes from the production site (the presence of the "L#" tag). The hardware should be able to toggle the "L#" tag (long vs. short labels). If that is not possible, the barcode of the media can be changed with the following command:
# /usr/openv/volmgr/bin/vmchange -barcode -m 
20. Verify that restores work.
More information on DR procedures can be found in Chapter 7 of the NetBackup Troubleshooting Guide (linked below).