Database Data Replication and High Availability Architecture
I. Topic Description
Data replication is one of the core technologies in database systems, referring to the process of copying the same data from one database server (the primary node) to one or more other servers (secondary nodes). High availability architecture, based on technologies like data replication, builds system designs that can quickly respond to single points of failure and ensure continuous service availability. This topic covers the core processes of data replication, topology structures, consistency models, and how to utilize these technologies to construct high availability architectures.
II. Core Process of Data Replication
The complete data replication process can be broken down into the following three core steps:
-
Data Capture
- Description: On the primary node (Master), any operation that causes a data change (e.g., INSERT, UPDATE, DELETE) needs to be identified and recorded to be propagated to secondary nodes.
- Implementation: The most mainstream method is based on logs. The database sequentially records all transactional operations in a binary log (e.g., MySQL) or write-ahead log (WAL, e.g., PostgreSQL). The replication process does not copy SQL statements or data rows directly, but rather copies the "events" from these logs.
- Advantage: Logs are append-only writes, making sequential I/O efficient. They record the original information of data changes, ensuring the accuracy and reliability of replication.
-
Data Transmission
- Description: Reliably sending the captured data change information (log events) from the primary node to one or more secondary nodes.
- Implementation:
- Asynchronous Transmission: The primary node responds to the client immediately after committing a transaction, then sends the log events to the secondary node(s) in the "background." This is the default or common mode, offering good performance but carries a risk of data loss (unsent log events are lost if the primary node fails).
- Semi-Synchronous Transmission: When the primary node commits a transaction, it must wait for at least one secondary node to acknowledge receipt of the log event before responding to the client. It strikes a balance between performance and consistency, reducing data loss risk.
- Fully Synchronous Transmission: The primary node must wait for all secondary nodes to acknowledge that they have applied the log events before responding to the client. It provides the strongest data consistency but has the highest performance overhead and latency.
-
Data Application
- Description: After the secondary node (Slave) receives the log events from the primary node, it re-executes these operations on its local database, thereby keeping the secondary node's data consistent with the primary node.
- Implementation:
- Statement-Based Replication: The secondary node receives and re-executes the original SQL statements executed on the primary node. Simple to implement, but may lead to inconsistencies due to non-deterministic functions (e.g.,
NOW(),RAND()) or triggers. - Row-Based Replication: The secondary node receives the specific before and after values of the changed data rows from the primary node. For example, an UPDATE operation transmits "for the row with id=1, age changed from 20 to 21." This method is more precise and offers better data consistency, making it the recommended approach for modern databases.
- Mixed Replication: The database dynamically chooses between statement-based or row-based replication based on the characteristics of the SQL statements, aiming to balance efficiency and accuracy.
- Statement-Based Replication: The secondary node receives and re-executes the original SQL statements executed on the primary node. Simple to implement, but may lead to inconsistencies due to non-deterministic functions (e.g.,
III. Common Replication Topologies
Different connection methods between primary and secondary nodes can be constructed based on business needs:
- Single Primary, Single Secondary: The most basic structure, used for data backup or read-write separation.
- Single Primary, Multiple Secondaries: The most common structure in production environments. One primary node corresponds to multiple secondary nodes, suitable for scenarios with more reads than writes, significantly improving read performance.
- Chained/Cascading Replication: The primary node replicates to only one secondary node (an intermediate node), which then replicates to other secondary nodes. This reduces the replication load on the primary node but adds latency through the intermediate node.
- Dual-Primary / Multi-Primary Replication: Two or more nodes act as primaries to each other, all accepting write operations. The architecture is complex and requires resolving data conflict issues. It is suitable for multi-active data center scenarios.
IV. From Replication to High Availability
The goal of high availability is to enable the system to automatically or manually quickly switch to an available secondary node when the primary node fails, minimizing service interruption time.
-
Failure Detection
- "Sentinel" processes or cluster management components deployed in multiple locations continuously send heartbeat packets to the primary node. If no response is received within a set time, the primary node is considered "subjectively down." After consultation among multiple sentinels, it is confirmed as "objectively down."
-
Primary-Secondary Switchover (Failover)
- Once the primary node is confirmed unavailable, the high availability management component initiates the failover process:
a. Select a secondary node whose data is closest to the original primary node (comparing log positions).
b. Ensure that data synchronization between this secondary node and the old primary node has completely stopped.
c. Promote this secondary node to be the new primary node.
d. Configure other secondary nodes to replicate from the new primary node.
- Once the primary node is confirmed unavailable, the high availability management component initiates the failover process:
-
Client Redirection
- Application clients need to be aware of the topology change. This is typically achieved using a Virtual IP Address (VIP) or a domain name. After completing the failover, the high availability management component binds the VIP to the new primary node or updates the DNS record. This way, clients do not need to modify their configuration; they simply need to reconnect to reach the new primary node.
V. Key Considerations and Challenges
- Replication Lag: Under asynchronous replication, the data on secondary nodes always lags behind the primary node. Applications must tolerate this lag or enforce reads from the primary node for operations requiring strong consistency.
- Data Consistency: In semi-synchronous/fully synchronous modes, a trade-off must be made between performance and data consistency.
- Failure Recovery and Data Reconciliation: When the old primary node recovers, it needs to be re-integrated into the cluster as a secondary node of the new primary node and synchronize the data missed during the downtime. This process requires careful handling.
By deeply understanding the principles of data replication and the methods for building high availability architectures, database systems can be designed to meet the requirements for data reliability and service continuity in different business scenarios.