Detailed Guide to Database Backup and Recovery Strategies

Problem Description
Database backup and recovery are core technologies for ensuring data security and business continuity. Interviewers will assess your understanding of backup types (physical/logical, full/incremental/differential), recovery strategies (PITR, disaster recovery), and their practical application. You need to master the principles of different backup methods, compare their pros and cons, and know how to design a complete backup and recovery plan.

Fundamental Backup Concepts

  1. Backup Objectives

    • RTO (Recovery Time Objective): The maximum tolerable business interruption time.
      Example: If recovery within 4 hours is required, a fast recovery solution must be chosen.
    • RPO (Recovery Point Objective): The maximum tolerable amount of data loss.
      Example: If at most 15 minutes of data loss is acceptable, frequent backups or log archiving must be enabled.
  2. Backup Type Comparison

    • Physical Backup: Directly copies database files (e.g., MySQL's .ibd files).
      → Advantages: Fast recovery speed, supports block-level incremental backups.
      → Disadvantages: Bound to the storage engine, poor cross-platform compatibility.
    • Logical Backup: Exports data via SQL statements (e.g., mysqldump).
      → Advantages: High readability, compatible with different versions.
      → Disadvantages: Slow recovery, time-consuming for large databases.
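
To make the RTO/RPO objectives above concrete, the following minimal sketch checks a plan against both targets. The `BackupPlan` class and its field names are hypothetical, and the model is deliberately simplified: it assumes that with log archiving the worst-case loss is one archive interval, and without it the loss is everything since the last backup.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass
class BackupPlan:
    """Hypothetical description of a backup plan (illustrative fields only)."""
    backup_interval: timedelta                 # time between consecutive backups
    log_archive_interval: Optional[timedelta]  # None if logs are not archived
    estimated_restore_time: timedelta          # e.g. measured in a recovery drill


def worst_case_data_loss(plan: BackupPlan) -> timedelta:
    """Upper bound on lost changes under the simplified model described above."""
    if plan.log_archive_interval is not None:
        # Archived logs let us roll forward to the last archived log.
        return min(plan.log_archive_interval, plan.backup_interval)
    # Without archived logs, everything since the last backup is lost.
    return plan.backup_interval


def meets_objectives(plan: BackupPlan, rpo: timedelta, rto: timedelta) -> bool:
    """True if the plan satisfies both the RPO and the RTO."""
    return (worst_case_data_loss(plan) <= rpo
            and plan.estimated_restore_time <= rto)
```

Under this model, a daily backup with logs archived every 5 minutes and a 2-hour restore satisfies an RPO of 15 minutes and an RTO of 4 hours, while the same plan without log archiving fails the RPO.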

Detailed Explanation of Full/Incremental/Differential Backups

  1. Full Backup

    • Scenario: Executed every Sunday at midnight, creates a complete data snapshot.
    • Recovery Process: Simply restore the latest full backup.
    • Storage Usage: Highest, but recovery steps are the simplest.
  2. Incremental Backup

    • Principle: Only backs up data changed since the previous backup, whether that was a full or an incremental one (changes are typically tracked via binary logs or page LSNs).
    • Recovery Process (Example: Data loss on Monday):
      Restore Sunday's Full Backup → Apply Monday's Incremental Logs → Data Recovery Complete
      
    • Advantages: Fast backup speed, low storage usage.
    • Risks: Chain dependency; corruption of any backup in the chain renders subsequent ones unusable.
  3. Differential Backup

    • Principle: Always captures all changes since the last full backup (e.g., with a Sunday full backup, Tuesday's differential contains both Monday's and Tuesday's changes).
    • Recovery Process (Example: Crash on Wednesday):
      Restore Sunday's Full Backup → Apply Tuesday's Differential Backup → Data Recovery Complete
      
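    • Balanced Trade-off: Recovery needs only the full backup plus the latest differential, so it is faster than replaying an incremental chain; storage usage is lower than repeated full backups, but each differential grows until the next full backup.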
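
The two recovery processes above differ only in which backups must be restored. The sketch below computes that restore chain for a given target day. The `Backup` class and `restore_chain` function are hypothetical names, and the sketch assumes an incremental backup captures changes since the most recent backup of any kind, which may differ from a specific tool's behavior.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class Backup:
    day: date
    kind: str  # "full", "incremental", or "differential"


def restore_chain(backups: List[Backup], target: date) -> List[Backup]:
    """Return the backups to restore, in order, to reach the state as of `target`."""
    usable = sorted((b for b in backups if b.day <= target), key=lambda b: b.day)
    # Every chain starts from the most recent full backup.
    full_idx = max((i for i, b in enumerate(usable) if b.kind == "full"),
                   default=None)
    if full_idx is None:
        raise ValueError("no full backup available on or before the target day")
    chain = [usable[full_idx]]
    for b in usable[full_idx + 1:]:
        if b.kind == "differential":
            # A differential holds everything since the full backup,
            # so it supersedes any earlier pieces in the chain.
            chain = [chain[0], b]
        elif b.kind == "incremental":
            # An incremental depends on every backup before it.
            chain.append(b)
    return chain
```

With a Sunday full backup followed by daily incrementals, a Wednesday restore needs the full backup plus every incremental in order; with daily differentials it needs only the full backup and Tuesday's differential.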

Practical Recovery Strategy Design

  1. Point-in-Time Recovery (PITR) Principle

    • Prerequisite: Enable binary logging and perform regular log archiving.
    • Key Steps:
      1. Restore the most recent full backup (e.g., from Monday early morning).
      2. Replay binary logs, starting from the binlog position recorded in that backup, up to the moment before the failure (e.g., Tuesday 14:30).
      mysqlbinlog --stop-datetime="2023-10-03 14:30:00" binlog.000012 | mysql -u root -p
      
  2. Disaster Recovery Plan

    • Cross-Region Backup: Synchronize full backups to a geographically remote data center monthly.
    • Validation Mechanism: Conduct regular recovery drills to ensure backup validity.
    • Monitoring Metrics: Backup success rate, trend of backup size changes, recovery test duration.
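
For the PITR steps above, the replay usually spans several binary log files, starting at the coordinates recorded in the full backup. The following sketch assembles the `mysqlbinlog` argument list for that case. The function name and the sample coordinates are hypothetical; `--start-position` and `--stop-datetime` are existing `mysqlbinlog` options. It assumes file names use MySQL's zero-padded numbering, so lexical order matches creation order.

```python
from typing import List


def build_replay_command(binlog_files: List[str], start_file: str,
                         start_position: int, stop_datetime: str) -> List[str]:
    """Build a mysqlbinlog argument list for point-in-time recovery.

    start_file / start_position are the binlog coordinates recorded in the
    full backup; stop_datetime is the moment just before the failure.
    """
    ordered = sorted(binlog_files)  # relies on zero-padded names (assumption)
    if start_file not in ordered:
        raise ValueError(f"{start_file} is missing from the archived logs")
    # Skip logs that are already contained in the restored full backup.
    files = ordered[ordered.index(start_file):]
    return [
        "mysqlbinlog",
        f"--start-position={start_position}",  # applies to the first file listed
        f"--stop-datetime={stop_datetime}",
        *files,
    ]
```

Passing all files to a single `mysqlbinlog` invocation, rather than running it once per file, keeps the replay in one session; the resulting output would then be piped to `mysql` as in the command shown earlier.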

Common Pitfalls and Optimizations

  1. Backup Lock Issues

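    • Error Example: Running mysqldump without --single-transaction, which locks the tables being dumped (--lock-tables is enabled by default) and blocks writes during the backup.
    • Correct Approach (consistent snapshot for InnoDB tables; --master-data is named --source-data as of MySQL 8.0.26):
      mysqldump --single-transaction --master-data=2 db_name > backup.sql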
      
  2. Log Management

    • Risk: Binary logs not regularly purged can fill up the disk.
    • Configuration Recommendations:
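      # MySQL configuration (expire_logs_days is deprecated in MySQL 8.0; use binlog_expire_logs_seconds there)
      expire_logs_days=7
      # Manual purging is a SQL statement, not a configuration variable
      PURGE BINARY LOGS BEFORE '2023-10-01 00:00:00';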
      
  3. Special Handling for Cloud Databases

    • Automated Backup: Utilize cloud platform snapshot features (e.g., AWS RDS automatic backups).
    • Cross-Region Replication: Configure log synchronization to a standby region for hot standby.
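
Returning to the log management item above: purging binary logs too aggressively breaks PITR, because every retained full backup still needs the logs written after it. The sketch below derives a purge cutoff under the assumption that logs older than the oldest retained full backup are no longer needed. The function names and the retention policy are hypothetical; `PURGE BINARY LOGS BEFORE` is the actual MySQL statement.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def safe_purge_cutoff(full_backup_times: List[datetime], now: datetime,
                      retention: timedelta) -> Optional[datetime]:
    """Return the time before which binary logs are no longer needed.

    Returns None when no full backup falls inside the retention window,
    in which case nothing should be purged.
    """
    retained = [t for t in full_backup_times if now - t <= retention]
    if not retained:
        return None
    # Logs written before the oldest retained full backup cannot be
    # needed to roll any retained backup forward.
    return min(retained)


def purge_statement(cutoff: datetime) -> str:
    """Format the MySQL statement that purges logs older than `cutoff`."""
    return f"PURGE BINARY LOGS BEFORE '{cutoff:%Y-%m-%d %H:%M:%S}';"
```

A job could run this after each successful full backup, so the purge cutoff only advances once a newer restorable backup exists.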

Summary
Designing a backup strategy means weighing business requirements (RTO/RPO), storage costs, and operational complexity against each other. A typical plan combines weekly full backups, daily incremental backups, and real-time binary log archiving, coupled with quarterly disaster recovery drills. In actual interviews, you can explain how to adjust backup frequency and validation mechanisms based on specific scenarios (e.g., before major e-commerce sales events).