Database Backup and Recovery Strategy
Problem Description
Database backup and recovery are core mechanisms for ensuring data security and business continuity. They involve strategies such as physical backup, logical backup, full/incremental/differential backup, and log-based recovery technologies. Interviewers expect you to master the characteristics and applicable scenarios of different backup methods and to be able to design reasonable recovery plans.
I. Basic Objectives and Classification of Backups
Core Objectives:
- Data Reliability: Prevent data loss caused by hardware failures or human errors.
- Recovery Time Objective (RTO): Maximum allowable time from business interruption to recovery.
- Recovery Point Objective (RPO): Maximum acceptable amount of data loss, expressed as a time window before the failure (e.g., tolerating at most 5 minutes of data loss).
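To make the two objectives concrete, here is a minimal sketch that checks a backup schedule against them. It assumes the worst-case data loss equals the interval between recoverable points (backups or archived logs) and that the restore duration is known from drills. The function and parameter names are illustrative only and do not come from any tool.

```python
from datetime import timedelta


def meets_objectives(capture_interval: timedelta,
                     restore_duration: timedelta,
                     rpo: timedelta,
                     rto: timedelta) -> dict:
    """Compare a schedule against RPO/RTO targets.

    Assumption: a failure just before the next capture loses one full
    capture interval of data, so that interval is the worst-case loss.
    """
    return {
        "worst_case_loss": capture_interval,
        "rpo_met": capture_interval <= rpo,
        "rto_met": restore_duration <= rto,
    }


# Hourly captures, 40-minute restore; targets: RPO 5 minutes, RTO 1 hour.
result = meets_objectives(
    capture_interval=timedelta(hours=1),
    restore_duration=timedelta(minutes=40),
    rpo=timedelta(minutes=5),
    rto=timedelta(hours=1),
)
print(result["rpo_met"], result["rto_met"])  # False True
```

In this example the RTO is met but the RPO is not, which is the usual signal that continuous log archiving is needed on top of periodic backups.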
Backup Classification Dimensions:
- By Content: Physical backup (data files/block-level), Logical backup (SQL statements).
- By Data Volume: Full backup, Incremental backup, Differential backup.
- By Business Impact: Hot backup (online backup), Cold backup (offline backup).
II. Detailed Explanation: Physical Backup vs. Logical Backup
Physical Backup:
- Target: Directly copies the physical files of the database (e.g., MySQL's .ibd files, PostgreSQL's base directory).
- Characteristics:
- Fast backup/recovery speed (file-level operations).
- Tightly coupled to the storage engine and version (the copied files must be in a consistent state, which usually requires a hot-backup tool or stopping writes).
- Usually requires combining with logs for Point-in-Time Recovery (PITR).
- Applicable Scenarios: Online systems with large data volumes and short recovery windows.
Logical Backup:
- Target: Exports SQL statements for table structures and data (e.g., mysqldump, pg_dump).
- Characteristics:
- High readability, supports cross-version/engine migration.
- Slow backup speed (requires executing SQL queries).
- May lose storage-specific features, such as engine-specific table options or partition layouts, when restoring to a different engine or version.
- Applicable Scenarios: Small-scale data migration, long-term archiving.
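As an illustration of the logical backup tools named above, the sketch below assembles typical mysqldump and pg_dump command lines. The database name and output paths are placeholders, and the chosen flags are one reasonable combination rather than a required one. The commands are only built and printed, not executed.

```python
import shlex


def mysqldump_command(database: str, out_file: str) -> list[str]:
    """Build a mysqldump invocation for a single database."""
    return [
        "mysqldump",
        "--single-transaction",       # consistent snapshot for InnoDB tables
        "--routines",                 # include stored procedures and functions
        "--triggers",                 # include triggers
        f"--result-file={out_file}",  # write the SQL to this file
        "--databases", database,      # also emit CREATE DATABASE / USE
    ]


def pg_dump_command(database: str, out_file: str) -> list[str]:
    """Build a pg_dump invocation for a single database."""
    return [
        "pg_dump",
        "--format=custom",            # archive format read by pg_restore
        f"--file={out_file}",
        database,
    ]


# "shop" and the file names are hypothetical placeholders.
print(shlex.join(mysqldump_command("shop", "shop.sql")))
print(shlex.join(pg_dump_command("shop", "shop.dump")))
```

Passing the resulting list to `subprocess.run(..., check=True)` would execute the dump; credentials are deliberately left out of the sketch.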
III. Design of Full/Incremental/Differential Backup Strategies
Full Backup:
- Backs up all data each time; occupies large space but recovery is simple.
- Recovery Steps: Directly restore the latest full backup.
Incremental Backup:
- Backs up only the data changed since the previous backup, whether full or incremental (requires recording an LSN or timestamp).
- Recovery Steps: Restore the most recent full backup → Apply all incremental backups in sequence.
- Space Advantage: Minimal backup volume, but if any backup in the chain is lost or corrupted, recovery cannot proceed past that point.
Differential Backup:
- Backs up all data changed since the last full backup.
- Recovery Steps: Restore the most recent full backup → Apply the latest differential backup.
- Reliability Advantage: Recovery requires only two backup sets (the full backup and the latest differential), offering stronger fault tolerance.
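The recovery steps for the three backup types can be sketched as a function that selects which backups to restore, and in what order. This is a simplified model: the `Backup` class and `restore_chain` function are hypothetical names, and `taken` is any comparable value such as a day number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    kind: str   # "full", "incremental", or "differential"
    taken: int  # day number or epoch seconds; only the ordering matters


def restore_chain(backups: list[Backup], target: int) -> list[Backup]:
    """Return the backups to restore, in order, to reach `target`."""
    usable = sorted((b for b in backups if b.taken <= target),
                    key=lambda b: b.taken)
    fulls = [b for b in usable if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup at or before the target")
    base = fulls[-1]
    later = [b for b in usable if b.taken > base.taken]
    chain = [base]
    diffs = [b for b in later if b.kind == "differential"]
    if diffs:
        # The latest differential supersedes everything since the full.
        chain.append(diffs[-1])
        later = [b for b in later if b.taken > diffs[-1].taken]
    # Every remaining incremental is required, in order.
    chain += [b for b in later if b.kind == "incremental"]
    return chain


incremental_week = [Backup("full", 0)] + [Backup("incremental", d) for d in (1, 2, 3)]
differential_week = [Backup("full", 0)] + [Backup("differential", d) for d in (1, 2, 3)]

print([b.taken for b in restore_chain(incremental_week, 3)])   # [0, 1, 2, 3]
print([b.taken for b in restore_chain(differential_week, 3)])  # [0, 3]
```

The incremental chain grows by one backup per day, while the differential chain always has two entries, which is the fault-tolerance difference described above.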
Example of Combined Strategy (Common in Enterprises):
- Schedule: Full backup every Sunday + Incremental backups Monday through Saturday.
- Space Calculation: Assuming a daily data change rate of 5%, the total weekly backup volume ≈ 1 + 6×0.05 = 1.3 times the full data volume.
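The space calculation above can be reproduced with a short sketch. The differential figure is an addition for comparison, not from the schedule above. It assumes each day's changes touch different data, so it is an upper bound. `weekly_volume` is an illustrative name.

```python
def weekly_volume(change_rate: float, days: int = 6) -> dict:
    """Weekly backup volume as a multiple of the full data volume.

    One full backup plus `days` follow-up backups. Assumes each day changes
    `change_rate` of the data and that daily changes do not overlap.
    """
    # Each incremental holds one day of changes.
    incremental = 1 + days * change_rate
    # The differential on day d holds d days of accumulated changes.
    differential = 1 + sum(d * change_rate for d in range(1, days + 1))
    return {"incremental": incremental, "differential": differential}


v = weekly_volume(0.05)
print(round(v["incremental"], 2), round(v["differential"], 2))  # 1.3 2.05
```

Under these assumptions the incremental schedule uses about 1.3 times the full volume per week, while a differential schedule would use up to about 2.05 times.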
IV. Precise Recovery Technology Based on Logs
Role of the Binary Log:
- Records every data change, either as SQL statements (STATEMENT format) or as row changes (ROW format).
- Used for incremental backup, replication (master-slave synchronization), and Point-in-Time Recovery (PITR).
Point-in-Time Recovery (PITR) Process:
- Step 1: Restore the most recent full backup (e.g., backup from Sunday at 23:00).
- Step 2: Find the log position corresponding to the backup (value recorded by SHOW MASTER STATUS).
- Step 3: Replay the binary log up to the moment before the failure:
    mysqlbinlog --start-position=107 --stop-datetime="2023-10-01 10:30:00" binlog.00001 | mysql -u root -p
- Key Requirement: Continuously archive binary logs during backup and ensure their integrity.
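The replay step can be parameterized rather than typed by hand. The sketch below only builds the mysqlbinlog argument list from the position and stop time used in the example above. `pitr_replay_command` is a hypothetical helper, and nothing is executed.

```python
import shlex


def pitr_replay_command(binlogs: list[str],
                        start_position: int,
                        stop_datetime: str) -> list[str]:
    """Build the mysqlbinlog command that replays changes up to a point in time.

    `binlogs` must list the log files in order, starting with the file
    that was current when the backup was taken.
    """
    if not binlogs:
        raise ValueError("at least one binary log file is required")
    return [
        "mysqlbinlog",
        f"--start-position={start_position}",  # position recorded at backup time
        f"--stop-datetime={stop_datetime}",    # stop just before the failure
        *binlogs,
    ]


cmd = pitr_replay_command(["binlog.00001"], 107, "2023-10-01 10:30:00")
# Shown as a shell pipeline; the output would be piped into the mysql client.
print(shlex.join(cmd) + " | mysql -u root -p")
```

When several log files are passed, the start position is expected to apply to the first file only, so the files must be listed in order without gaps.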
V. Disaster Recovery Solutions and Backup Strategy Design Cases
Cross-Data Center Disaster Recovery Architecture:
- Local Backup: Daily full backup + hourly incremental backups, retained for 7 days.
- Remote Synchronization: Real-time synchronization to a remote slave database via binary logs.
- Recovery Testing: Simulate recovery monthly to verify backup availability.
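A 7-day retention rule needs one safeguard: incremental backups inside the window still depend on the full backup that precedes them. The following sketch shows one way to select backups that are safe to delete. The `(kind, taken)` tuple layout and the `expired_backups` name are assumptions for illustration.

```python
from datetime import datetime, timedelta


def expired_backups(backups, now, retention=timedelta(days=7)):
    """Return the (kind, taken) entries that can be deleted safely."""
    cutoff = now - retention
    old_fulls = [t for kind, t in backups if kind == "full" and t <= cutoff]
    # Keep the newest full at or before the cutoff, because incrementals
    # inside the retention window may still depend on it.
    keep_from = max(old_fulls) if old_fulls else cutoff
    return [(kind, t) for kind, t in backups if t < keep_from]


# Daily fulls at 23:00 on Oct 1 to Oct 9, plus one incremental on Oct 3.
backups = [("full", datetime(2023, 10, d, 23)) for d in range(1, 10)]
backups.append(("incremental", datetime(2023, 10, 3, 6)))

expired = expired_backups(backups, now=datetime(2023, 10, 10, 12))
print(expired == [("full", datetime(2023, 10, 1, 23))])  # True
```

Here the Oct 2 full backup and the Oct 3 incremental are older than 7 days but are kept, because together they are the base for restoring to the start of the retention window.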
Cloud Database Backup Strategy:
- Automatic Snapshots: Utilize cloud platform snapshot features to generate daily full snapshots.
- Log Hosting: Use cloud database log storage functions (e.g., AWS RDS PITR support).
- Cost Trade-off: Select storage types based on RTO/RPO (e.g., keep recent backups in standard storage and move older ones to archival storage, which is cheaper but slower to restore from).
VI. Extension of Common Interview Questions
- Question 1: "How to recover after accidentally deleting a table?"
- Answer: Restore the table individually from a logical backup, or restore a full backup to a separate instance, replay the logs up to just before the deletion, and then export the table's data.
- Question 2: "How to verify backup validity?"
- Answer: Regularly conduct recovery drills and check data consistency (using tools like mysqldbcompare).
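The consistency check in a recovery drill can be illustrated by comparing per-table row counts and digests between the source and the restored copy. The sketch below uses in-memory sqlite3 databases as a stand-in so that it runs anywhere. It does not reproduce mysqldbcompare, and the table and function names are hypothetical.

```python
import hashlib
import sqlite3


def table_digest(conn: sqlite3.Connection, table: str) -> tuple[int, str]:
    """Return (row_count, sha256) over all rows of `table` in a stable order."""
    # `table` is interpolated into SQL, so it must come from a trusted list.
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    order_by = ", ".join(cols)
    digest = hashlib.sha256()
    count = 0
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY {order_by}"):
        digest.update(repr(row).encode())
        count += 1
    return count, digest.hexdigest()


def make_db(rows):
    """Create a throwaway database with one sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


source = make_db([(1, 9.5), (2, 20.0)])
restored = make_db([(2, 20.0), (1, 9.5)])  # same rows, different insert order
damaged = make_db([(1, 9.5)])              # one row missing

print(table_digest(source, "orders") == table_digest(restored, "orders"))  # True
print(table_digest(source, "orders") == table_digest(damaged, "orders"))   # False
```

Sorting by all columns makes the digest independent of insert order, so only real differences in content cause a mismatch.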
In summary, a backup strategy must be designed around business requirements (RTO/RPO) and data characteristics (data volume and change frequency). The sections above cover the end-to-end knowledge this requires, from basic concepts to enterprise-level practices.