Database Backup and Recovery Strategy

Database Backup and Recovery Strategy

Description
Database backup and recovery are core technologies for ensuring data security and business continuity. Backup refers to the process of copying data from a database to other storage media, while recovery involves restoring the data to a usable state using backups after data loss or corruption. The core challenge lies in balancing the impact of backups on system performance with the time objective for recovery (RTO) and the tolerance for data loss (RPO).

Knowledge Point Explanation

1. Main Types of Backup
Backup is not simply copying files; strategies must be chosen based on data characteristics.

  • Physical Backup vs. Logical Backup

    • Physical Backup: Directly copies the physical files of the database (e.g., data files, control files, redo log files). Analogous to taking a picture of the entire house. Advantages are high speed (especially for full recovery) and tight coupling with the storage engine; disadvantages are large backup files and typically being bound to specific database versions and operating systems, resulting in poor flexibility.
    • Logical Backup: Exports the logical structures (e.g., tables, views, stored procedures) and data from the database as SQL statements or text files in a specific format. Analogous to creating an inventory list and placement map for all furniture in the house. Advantages are flexible recovery (e.g., single-table recovery) and good compatibility (cross-version and cross-platform); disadvantages are slower speed (requires SQL execution for restoration) and potential exclusion of certain physical properties like indexes.
  • Hot Backup vs. Cold Backup

    • Hot Backup (Online Backup): Performed while the database is running and normally serving business operations. This requires the database to be in archive log mode. During backup, operations such as inserts, updates, and deletes are simultaneously recorded in redo logs, ensuring data consistency in the backup. This is the most common method in production environments.
    • Cold Backup (Offline Backup): Performed after the database is completely shut down. There are no user connections at this time, and data files are in a consistent state. The operation is simple but requires service downtime, impacting availability.

2. Evolution of Backup Strategy: From Full to Incremental
To reduce backup data volume and time, various combined backup strategies have been developed.

  • Full Backup

    • Description: Each backup copies all data in the entire database. This forms the foundation for all recovery operations.
    • Advantages: Recovery operation is simplest, requiring only one backup file.
    • Disadvantages: Consumes significant storage space, takes a long time, and heavily consumes system resources.
  • Incremental Backup

    • Description: Only backs up data blocks or data that have changed since the last backup of any type.
    • Process:
      1. Perform a full backup on Sunday.
      2. Monday's incremental backup only backs up data changed after Sunday's full backup.
      3. Tuesday's incremental backup only backs up data changed after Monday's incremental backup (note: not relative to Sunday).
    • Advantages: Fast backup speed and small space usage.
    • Disadvantages: Recovery is more complex, requiring the sequential application of all incremental backups in chronological order. For example, to recover to Wednesday's state, you must first restore Sunday's full backup, then apply Monday's incremental, and finally apply Tuesday's incremental.
  • Differential Backup

    • Description: Only backs up all data that has changed since the last full backup.
    • Process:
      1. Perform a full backup on Sunday.
      2. Monday's differential backup backs up all data changed since Sunday.
      3. Tuesday's differential backup still backs up all data changed since Sunday (therefore, Tuesday's backup includes Monday's changes).
    • Advantages: Recovery is simpler than with incremental backups. Only two backups are needed: the latest full backup and the latest differential backup.
    • Disadvantages: The size and time required for backup files grow with the time elapsed since the last full backup.

3. Types and Process of Recovery
Recovery is the process of restoring the database to a specific point in time using backups and logs.

  • Instance Recovery: Occurs automatically when the database restarts after an abnormal shutdown (e.g., power failure). The database uses online redo log files to roll forward all committed transactions not yet written to data files and roll back all uncommitted transactions, ensuring data consistency. This does not require manual DBA intervention.

  • Media Recovery: Manual recovery required when data files themselves are corrupted or lost. This is our main focus.

    • Complete Recovery: Restores the database to the last consistent state before the failure, ensuring zero data loss.
      • Process (using data file corruption as an example):
        1. Identify Fault: The database reports an error when attempting to read the corrupted file.
        2. Take File Offline: Set the corrupted data file to offline status.
        3. Restore: Copy a good data file from the backup to the original location.
        4. Recover: Apply archived logs and online redo logs to "roll forward" the restored old data file to the current point in time. This process replays all data changes made since the backup was created.
        5. Bring File Online: Bring the data file back online, restoring normal database operation.
    • Incomplete Recovery (Point-in-Time Recovery): Restores the database to a specific past point in time. Commonly used for human errors (e.g., accidental table deletion).
      • Process:
        1. Restore a copy of all data files from a backup (ensuring these files are from the same full backup point in time).
        2. Apply archived logs and online redo logs, but only up to the point in time (T) just before the erroneous operation.
        3. Open the database with RESETLOGS, which creates a new log sequence and discards all changes made after time T.

4. Practical Strategy: Combination and Best Practices
A robust backup and recovery plan is usually a combination of multiple types.

  • Typical Example:

    • Perform a weekly full hot backup.
    • Perform a daily differential backup.
    • Archive redo logs every 15 minutes.
  • Recovery Scenario Simulation (Data File Corruption at 10:05 AM on Wednesday):

    1. RPO Goal: Lose at most 15 minutes of data (i.e., recover to the 10:00 state).
    2. Recovery Steps:
      a. Restore Base Files: Apply last Sunday's full backup.
      b. Apply Differential Backup: Apply the differential backup made Tuesday night (containing all changes from Monday to Tuesday). The database state is now restored to Tuesday night.
      c. Apply Archived Logs: Sequentially apply all archived log files generated from Tuesday night to 10:00 AM Wednesday. This step "rolls forward" the database to the 10:00 state.
      d. Complete Recovery: The database is restored to 10:00, losing only operations from 10:00 to 10:05 (this data can potentially be re-entered via other means based on business logic).

Summary
Mastering database backup and recovery hinges on understanding the characteristics and applicable scenarios of each backup type and designing a reasonable combination of backup strategies based on business requirements for RTO and RPO. The core idea is: trading space (storage for backups) and processes (backup operations) for time (recovery speed) and security (data integrity). Regularly conducting recovery drills is the only way to verify backup effectiveness.