Skip to main content

skip to main content

developerWorks  >  Information Management  >

Using DB2 Incremental Backup

developerWorks
Document options

Document options requiring JavaScript are not displayed


Learn and share!

Exchange know-how with your peers -- try our new Pass It Along beta app


Rate this page

Help us improve this content


Level: Introductory

Blair AdamacheIBM

09 May 2002

One of the key high availability features in DB2 Universal Database 7.2 for warehousing environments is incremental backup. This article describes the how incremental backup works, when you might want to use it, and strategies for ensuring smooth recoveries.

Introduction

DB2® Version 7.2 added many enhancements to improve the availability and recoverability of databases, among them:

  • On-demand log archive
  • Backup from split image
  • Dual logging
  • Veritas cluster support on Solaris

This article discusses one of the most powerful backup improvements for data warehouses: incremental backup. Incremental backup was added in Fixpak 3. Incremental backup means that only changes are backed up, which provides additional flexibility in designing a backup strategy for some types of databases environments.

Why back up just the changes?

Ever save a document in a word processor, make a small change, and then save the entire document again? Consider why you did this - because you wanted to be sure that your latest changes were saved. When you use a relational database, there's something more valuable than the hardware you buy to manage the data or the bright people who keep the machine and software running: the data itself is the ultimate asset. As long as you protect that data, everything else is a bonus.

DB2 has backup and restore commands native to the DB2 engine. Backups can be taken offline (when no users are connected to the database) or online, as the database is being changed. DB2 backups tend to be scheduled by the DBA to run at some interval: weekly, nightly or hourly, for example. The concept of "I've just changed a lot of things so now I'm going to save my work," does not apply to DB2 (except for users committing data, but that doesn't make an independent copy). Instead, the changes between each backup image are captured in the DB2 logs. These two pieces (the backup image and the logs representing all work since the backup) are the fundamental building blocks of a DB2 disaster recovery plan: Save the backup images and logs some place other than the machine handling the DB2 transactions, and you can restore your data even if the machine that runs your DB2 gets immersed in water and used as a boat anchor.

Captain's log

The concepts of a backup image and logs were designed for a traditional database model with lots of transactions, such as the inventory system in a warehouse. As relational databases complete their takeover of the Information Technology world, there is a lot of data that changes infrequently being stored in DB2. A database of an auto parts warehouse is very active (cars break more than software), but the majority of rows in a property tax database should change little because most homeowners stay on their property a long time after they move in. There is a mass of data on paper and microfilm that we need to save, and which would be far more valuable in a relational database, but the costs of backing up the contents of the Library of Congress for the 1% of it that changes every n weeks could negate the benefit of storing it digitally.

In multimedia applications, much of the data is stored as large objects (LOBs), which tend not to be logged, so even using the backup-plus-log strategy is flawed. If you do log LOBs, the transactional speed of the database will be degraded. For this reason, DB2 introduced incremental backup--a way to save only the changes since the last backup.



Back to top


Advantages of incremental backup

There are two ways to track changes and store them in another place for recovery:

  1. DB2 can write every INSERT, UPDATE, DELETE, CREATE, ALTER, DROP, GRANT and REVOKE and statement to the logs. When a restore is needed between the time of the last database backup and the time of a failure, DB2 can run through the log and re-create those changes, just like Sherlock Holmes reconstructing a crime by following the footsteps of each suspect. This is most efficient for a large number of transactions.
  2. Alternatively, DB2 can save a copy of each page when it changes, which is how incremental backup works.

If the database is highly active, it doesn't make sense to keep a copy of each page when it changes, because you may end up having a copy of every page in the database--essentially a new backup image-- which defeats the purpose of tracking only incremental page changes, and it is probably faster to log SQL. On the other hand, if all changes are concentrated in a small minority of the pages, or if most pages don't change, saving changed pages in an incremental backup image will save a lot of time and storage when creating backup images, because if nothing changes on the page the incremental backup skips it.

An incremental backup just makes backup and restore quicker for databases with a small amount of transactions by:

  • Making backup faster (You're only saving the pages that have changed since the last backup, not every page in the database.)
  • Making backup images smaller (You're only saving the pages that have changed, not all pages in the database.)

Incremental backups do not eliminate the need for logs. If you're running DB2 with rollforward recovery enabled (LOGRETAIN=ON in the database configuration), the logs allow you to roll forward to a specific point in time. Turn LOGRETAIN=ON in the database configuration); the logs allow you to roll forward to a specific point in time. Turn LOGRETAIN on as follows:

 
UPDATE DATABASE CONFIGURATION FOR SAMPLE USING LOGRETAIN ON 

This will force an immediate database backup to establish a point of consistency for backup images, and from this point on you'll have to save logs, so ensure you have the storage (disk, tapes, a storage manager like IBM Tivoli® Storage Manager, or storage managers from Legato or Veritas) to archive the backup images and logs.



Back to top


Enabling incremental backup

A database is enabled for incremental backup with the database configuration parameter TRACKMOD=ON (track modified pages). Enable this parameter as follows:

 
	UPDATE DATABASE CONFIGURATION FOR SAMPLE USING TRACKMOD ON 

Before allowing applications to change data, you must back up the database after turning on TRACKMOD.



Back to top


A range of backup data

There are two types of incremental backup:

  1. Full incremental backup. An image of all pages changed since the last complete backup, whether it is a full or tablespace backup image. An example of a full incremental backup is:
     
    db2 backup db sample online incremental use tsm 
    

  2. Delta. All pages changed since the last backup of any kind (another delta, an incremental, or a full backup image). An example of a delta backup is:
     
    db2 backup db sample online incremental delta use tsm 
    

This gives us up to four kinds of backup data from which to recover a damaged database:

  1. A full backup image. This is the building block of any recovery strategy, without a full backup image, you can't start to restore. If the backup is taken online, you require the logs of all transactions that transpired while the backup was taking place. Restore a full backup, replay the logs of all transactions since the backup, and your recovery is complete.
  2. An incremental backup. This includes all changes since the last full backup. Restore a full backup, restore the incremental backup, replay the logs since the incremental backup, and your recovery is complete.
  3. A delta backup. This includes all changes since the last backup of any kind. If the last backup was a full backup image, it and the delta provide the most complete backup. If a delta is preceded by an incremental backup image, you need the delta, the incremental backup, and the full backup image on which the incremental is based. If a delta is preceded by one or more deltas, you need all deltas until you reach an incremental backup or a full backup image.
  4. The logs. The logs contain all transactions since the last backup you were able to restore.

There is another dimension to all of this: DB2 supports backups at both the level of the entire database, and for specific tablespaces (a more granular level of backup and restore that allows you to restrict a backup or restore to critical tablespaces). Backing up and restoring incremental and delta images applies to tablespaces as well.



Back to top


Recovery strategies

Let's first consider the recovery strategy for a database that has a full backup each Sunday and a nightly incremental backup, as shown in Figure 1. (Logs are omitted from these pictures, but are necessary after the final restore.)

  • If you need to restore on Monday, restore with the Sunday night backup image, and apply all available logs.
  • If you need to restore on Tuesday through Sunday, restore the previous Sunday's full backup image, then restore the incremental backup image from the previous night, and finally apply the logs that followed the restored incremental image.

Any recovery needs:

  • One full backup image
  • Zero or one incremental backups
  • One day's logs

Note that each backup gets bigger until the next full backup. This is because there are more changed pages to back up on Saturday than Monday--six days' worth of work versus just one on Monday.


Figure 1: Full and incremental backup images
incremental backups

The nightly backup will be smaller if you just take a delta backup, as shown in Figure 2.


Figure 2: Full and delta backup images
delta backups

By using delta backups, the nightly backup on Monday through Saturday requires less storage and finishes more quickly. But, while the recovery strategy in Figure 1 requires only one full backup, one incremental, and logs, using nightly deltas requires all deltas since the last full or incremental backup.

To help you keep track of backups, the DB2 history file keeps track of the backup history for a database. See the DB2 Command Reference for the syntax of how to interrogate the database backup history. For example:

 
	LIST HISTORY BACKUP ALL FOR SAMPLE 

DB2 itself uses the history file to determine if any previous images are required for recovery and will attempt to restore them automatically. This may lead to a chain of backup images as shown in Figure 2.

Figure 3 shows how recovery requires a full backup image, the last incremental backup, all delta backups since the last incremental backup, and the logs. The delta images preceding the last incremental (Monday through Wednesday) are not needed. Note that if you have all logs following a full backup image, and one of the incremental images is damaged, the full image and the logs can still be used for recovery, just as in a database not using incremental backup.


Figure 3: Full, incremental and delta backup images
full backups

Because DB2 use the history file to figure out what comes next in incremental recovery, deleting entries from the history file (done with the PRUNE HISTORY command) should be done with caution. The history file is a piece of metadata as important to incremental backup as the system catalogs are to the database.

Automatic restore

If you restore using the INCREMENTAL AUTOMATIC keywords, DB2 determines what to restore:

 
RESTORE DB SAMPLE INCREMENTAL AUTOMATIC TAKEN AT (SAT)

Automatic restore was added to DB2 Version 7.2 in Fixpak 4. When you specify INCREMENTAL AUTOMATICC, DB2 will determine if any previous backup images are required and attempt to restore them automatically. The history file determines the sequence of required backup images. DB2 will start at the last backup image that contains complete copies of all tablespaces you are restoring, and then apply subsequent incremental images. Subsequent backup images do not need to contain all tablespaces you are restoring.

DB2 Version 7.2 also contains db2ckrst, a command that parses the history file and describes which backup images are required before you start a restore.



Back to top


Conclusion

Incremental backup is a good way to protect a database which is largely read-only, but has some INSERT/UPDATE/DELETE activity against it. It can also be useful for a database in which changes are isolated to a small subset of the tablespaces. Because an incremental backup strategy contains potentially four type of data for recovery (full images, incrementals, deltas and logs), keep track of what you're doing, and ensure all DBAs supporting the database understand the strategy and know how to read the DB2 history file. If you work in the type of company that can lose archived logs required for disaster recovery, the presence of incremental and delta images gives you more things to misplace. If you recycle tapes for logs, and don't know where to start when recovery is needed, establish some sound principles of a backup and recovery strategy:

  • Take a full backup before any fundamental change such as a migration, fixpak installation, or a major application change.
  • Use the DB2 command db2ckbkp to check the integrity of backup images. This command is documented in the DB2 Command Reference.
  • Use the DB2 command db2ckrst to query the history file and ensure that your understanding of what you require for an inremental restore is matched by DB2's record.
  • Practice a restore so you know how long it takes, and where to find all relevant pieces such as backup images and logs. Multiple people need to practice this, so that you can go on vacation or retire.
  • Understand the database schema if you plan to use rollforward point-in-time recovery and/or to restore tablespaces one at a time. Restoring a particular tablespace first does not improve availability if it contains tables with referential constraints to tables in other tablespaces.
  • Make a decision on what you'll do if a backup ever fails to complete:
    • Do you complete the backup before allowing production to continue?
    • Will you have sufficient logs to bridge back to the previous backup image?
    • What happens to your backup strategy if users demand access to the database before a backup completes?
    • If using online backup, recognize that the backup image is useless without the logs of all transactions that occurred while DB2 was backing up the database.

When disaster strikes, DBAs can make things worse if they don't know the rules above. If they do follow these rules, they can be like Ed Harris in Apollo XIII, and ensure that the disaster is your organization's finest hour.

IBM, DB2, and Tivoli are trademarks or registered trademarks of IBM Corporation in the United States, other countries, or both.

Other company, product, and service names may be trademarks or service marks of others.



About the author

Blair Adamache has been working in relational database development and service at IBM Toronto Lab since 1987. Blair can be reached at badamache@yahoo.com. Miro Flasza has been one of the DB2 Load and Data Management gurus since DB2 Version 5. Prior to V5, Miro worked on extending DB2 functionality for geo-spatial applications as part of his Masters thesis research. Miro can be reached at miro@ca.ibm.com. Dale McInnis has been part of DB2 Development since Version 1. He owns DB2 Backup, Restore and Recovery, and owns online backup and the generic vendor interface for backup and recovery used by several storage vendors. Dale can be reached at dmcinnis@ca.ibm.com.




Rate this page


Please take a moment to complete this form to help us better serve you.



 


 


Not
useful
Extremely
useful
 


Share this....

digg Digg this story del.icio.us del.icio.us Slashdot Slashdot it!



Back to top