Database Read-Write Separation: Principles and Practice

Problem Description:
Read-write separation is a common database architecture optimization technique. It distributes database read and write operations to different server nodes to improve overall system performance and scalability. Core questions include: How to implement read-write separation? How to handle data synchronization latency between master and slave nodes? In which scenarios is read-write separation applicable?

Solution Process:

Basic Principles of Read-Write Separation
- Goal: Direct write operations (INSERT/UPDATE/DELETE) to the master node (Master) and read operations (SELECT) to one or more slave nodes (Slave), so that the load no longer falls on a single node.
- Underlying Technology: Data synchronization relies on master-slave replication (e.g., MySQL's binlog replication). The master node records its data changes and the slave nodes replay them.
- Key Components:
  - Load Balancer/Middleware (e.g., ShardingSphere, MySQL Router): Parses SQL statements and routes requests based on the operation type.
  - Data Synchronization Channel: Ensures slave node data is eventually consistent with the master node (though latency may exist).
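
The routing decision described above can be sketched as a small classifier. This is a minimal illustration, not how ShardingSphere or MySQL Router parse SQL: the names `READ_KEYWORDS`, `is_read_statement`, and `target_for` are hypothetical, and the check only looks at the first keyword of the statement. Anything it does not recognize as a read goes to the master, which is the safe default.

```python
import re

# Statement keywords treated as reads (an assumption for this sketch);
# everything else is routed to the master.
READ_KEYWORDS = {"SELECT", "SHOW"}

# SELECT ... FOR UPDATE takes row locks, so it has to run on the master.
_FOR_UPDATE = re.compile(r"\bFOR\s+UPDATE\b", re.IGNORECASE)


def is_read_statement(sql: str) -> bool:
    """Return True if the statement looks like a plain read-only query."""
    words = sql.split(None, 1)
    if not words or words[0].upper() not in READ_KEYWORDS:
        return False
    return _FOR_UPDATE.search(sql) is None


def target_for(sql: str) -> str:
    """Name the node role that should execute the statement."""
    return "slave" if is_read_statement(sql) else "master"


if __name__ == "__main__":
    for statement in (
        "SELECT * FROM orders WHERE id = 1",
        "UPDATE orders SET status = 'paid' WHERE id = 1",
        "SELECT * FROM orders WHERE id = 1 FOR UPDATE",
    ):
        print(target_for(statement), "<-", statement)
```

Real middleware uses a full SQL parser, because keyword matching misses cases such as stored procedure calls that write data.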

Implementation Steps for Read-Write Separation
- Step 1: Set Up Master-Slave Replication Environment
  - Enable binlog on the master node and give every node a unique server ID; point each slave node at the master (host, replication account, and binlog position or GTID) and start its replication threads.
  - Verify Synchronization: Insert a row on the master node and confirm that it appears on each slave node shortly afterwards.
- Step 2: Introduce Routing Middleware
  - Configure data sources, specifying the addresses of master and slave nodes.
  - The middleware automatically routes requests by analyzing SQL syntax (e.g., identifying SELECT as a read operation).
  - Example: Write requests are sent to the master database; read requests are distributed in a round-robin manner to multiple slave databases for load balancing.
- Step 3: Handle Special Scenarios
  - Forcing Reads from Master: Some operations must see the latest data (e.g., querying an order immediately after payment). In these cases, force the read to the master with a SQL hint or a routing configuration.
  - Reads and Writes in Transactions: To avoid data inconsistency, all operations within a transaction are typically routed to the master database.
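
Steps 2 and 3 can be combined in one small router. The sketch below is an illustration under stated assumptions rather than the behavior of any particular middleware: the class `ReadWriteRouter`, its methods, and the node addresses are hypothetical, the `/*master*/` marker is only one possible hint convention, and the transaction flag is kept on the router itself for brevity, whereas a real router would track it per connection.

```python
import itertools
from typing import Iterable

MASTER_HINT = "/*master*/"  # assumed hint convention, not a standard


class ReadWriteRouter:
    """Pick a node address for each statement (hypothetical sketch)."""

    def __init__(self, master: str, slaves: Iterable[str]) -> None:
        self.master = master
        slave_list = list(slaves)
        # Round-robin over the slaves; None means there is no slave to use.
        self._next_slave = itertools.cycle(slave_list) if slave_list else None
        self._in_transaction = False

    def begin(self) -> None:
        """Mark the start of a transaction: pin all statements to the master."""
        self._in_transaction = True

    def end(self) -> None:
        """Mark commit or rollback: reads may go to slaves again."""
        self._in_transaction = False

    def route(self, sql: str) -> str:
        statement = sql.lstrip()
        if self._in_transaction or self._next_slave is None:
            return self.master
        if statement.startswith(MASTER_HINT):
            return self.master  # caller asked for the latest data
        words = statement.split(None, 1)
        if not words or words[0].upper() != "SELECT":
            return self.master  # writes and anything unrecognized
        return next(self._next_slave)


if __name__ == "__main__":
    router = ReadWriteRouter("master:3306", ["slave1:3306", "slave2:3306"])
    print(router.route("SELECT * FROM products"))                  # slave1:3306
    print(router.route("SELECT * FROM products"))                  # slave2:3306
    print(router.route("INSERT INTO orders VALUES (1)"))           # master:3306
    print(router.route("/*master*/ SELECT * FROM orders WHERE id = 1"))  # master:3306
```

Checking the transaction flag first keeps a read that follows a write inside the same transaction on the master, so the transaction always sees its own changes.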

Strategies to Handle Data Synchronization Latency
- Root Cause: Master-slave replication is asynchronous or semi-synchronous, so a slave node can lag behind the master and return stale data to readers.
- Solutions:
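  - Fallback to Reading from Master: For queries requiring high consistency, temporarily switch to reading from the master, for example by marking the statement with a hint comment such as /*master*/ (the exact hint syntax depends on the middleware).
  - Latency Monitoring and Alerts: Monitor the Seconds_Behind_Master field reported by SHOW SLAVE STATUS on each slave node and trigger alerts when it exceeds a threshold.
  - Semi-Synchronous Replication: The master acknowledges a commit to the client only after at least one slave node has confirmed receipt of the transaction's changes. This narrows the window for data loss but adds a network round trip to every write, and because the slave may not have applied the changes yet, reads from it can still be stale.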
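
The fallback and monitoring strategies above can be tied together in a lag-aware read selector. This is a minimal sketch under assumptions: the function `choose_read_target`, its parameters, and the one-second default threshold are hypothetical, and the lag values are assumed to have been collected elsewhere from each slave's Seconds_Behind_Master, with `None` standing for a slave whose replication is not running (MySQL reports NULL in that case).

```python
from typing import Mapping, Optional


def choose_read_target(
    master: str,
    slave_lag_seconds: Mapping[str, Optional[int]],
    max_lag_seconds: int = 1,
    require_fresh: bool = False,
) -> str:
    """Return the node a read should go to, given the current slave lag."""
    if require_fresh:
        # Consistency-critical query: always read from the master.
        return master
    healthy = {
        name: lag
        for name, lag in slave_lag_seconds.items()
        if lag is not None and lag <= max_lag_seconds
    }
    if not healthy:
        # Every slave is lagging or broken: fall back to the master.
        return master
    # Prefer the slave that is closest to the master.
    return min(healthy, key=lambda name: healthy[name])


if __name__ == "__main__":
    lags = {"slave1": 0, "slave2": 5, "slave3": None}
    print(choose_read_target("master", lags))                      # slave1
    print(choose_read_target("master", lags, require_fresh=True))  # master
    print(choose_read_target("master", {"slave2": 5}))             # master
```

Falling back to the master protects correctness but shifts read load onto it, so sustained lag should still raise an alert rather than be absorbed silently.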

Applicable Scenarios and Limitations
- Applicable Scenarios: Read-heavy, write-light workloads (e.g., e-commerce product browsing, news websites), where read capacity can be scaled horizontally by adding slave nodes.
- Inapplicable Scenarios:
  - Write-intensive workloads (the master node remains a bottleneck).
  - Read scenarios requiring strong consistency (e.g., bank balance inquiries).
- Note: Read-write separation "scales reads" rather than "scales writes". To scale writes, consider database and table sharding.
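
A toy capacity model makes the note above concrete. It is a deliberately simplified sketch with assumed inputs: every node is taken to handle the same fixed number of operations per second, reads and writes are taken to cost the same, all reads go to slaves, and each slave has to replay every write it receives from the master. The function name `cluster_capacity` and the figures are hypothetical.

```python
def cluster_capacity(node_capacity: float, write_rate: float, slave_count: int) -> dict:
    """Estimate read and write ceilings for one master plus N slaves."""
    if write_rate > node_capacity:
        raise ValueError("write rate exceeds what a single master can absorb")
    # Each slave spends part of its capacity replaying the master's writes.
    read_capacity_per_slave = node_capacity - write_rate
    return {
        "max_write_rate": node_capacity,  # unchanged by adding slaves
        "max_read_rate": slave_count * read_capacity_per_slave,
    }


if __name__ == "__main__":
    # Read-heavy workload: doubling the slaves doubles the read ceiling.
    print(cluster_capacity(10_000, 1_000, 2))  # max_read_rate: 18000
    print(cluster_capacity(10_000, 1_000, 4))  # max_read_rate: 36000
    # Write-heavy workload: slaves are mostly busy replaying writes.
    print(cluster_capacity(10_000, 9_000, 4))  # max_read_rate: 4000
```

In this model the write ceiling never moves, however many slaves are added, which is why write-heavy workloads need sharding instead.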

Practical Cases and Optimization Techniques
- Case: The read volume of a social media platform's feed page far exceeds the publish volume. After read-write separation was introduced, the QPS the platform could sustain rose from 5k to 20k.
- Optimization Techniques:
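  - Add indexes on slave nodes that the master does not carry, tailored to the queries those slaves serve (e.g., report queries), so that the master does not pay the write cost of maintaining them.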
  - Use multi-threaded replication (MySQL 5.7+) to reduce synchronization latency.
  - Regularly check master-slave data consistency (using tools like pt-table-checksum).
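
The idea behind a consistency check can be illustrated with chunked checksums. This sketch is not how pt-table-checksum is implemented; it only shows the general principle of comparing per-chunk checksums instead of individual rows. The functions `chunk_checksums` and `differing_chunks` are hypothetical, rows are plain tuples whose first element is assumed to be the primary key, and in practice the rows would come from queries against each node.

```python
import zlib
from typing import Iterable, List, Optional, Sequence


def chunk_checksums(rows: Iterable[Sequence], chunk_size: int) -> List[int]:
    """Sort rows by primary key and compute one CRC32 per chunk of rows."""
    ordered = sorted(rows, key=lambda row: row[0])
    checksums = []
    for start in range(0, len(ordered), chunk_size):
        chunk = ordered[start:start + chunk_size]
        payload = "\n".join(repr(tuple(row)) for row in chunk).encode("utf-8")
        checksums.append(zlib.crc32(payload))
    return checksums


def differing_chunks(
    master_rows: Iterable[Sequence],
    slave_rows: Iterable[Sequence],
    chunk_size: int = 2,
) -> List[int]:
    """Return the indexes of chunks whose checksums differ between the nodes."""
    master_sums = chunk_checksums(master_rows, chunk_size)
    slave_sums = chunk_checksums(slave_rows, chunk_size)
    diffs = []
    for index in range(max(len(master_sums), len(slave_sums))):
        m: Optional[int] = master_sums[index] if index < len(master_sums) else None
        s: Optional[int] = slave_sums[index] if index < len(slave_sums) else None
        if m != s:  # a missing chunk on either side also counts as a difference
            diffs.append(index)
    return diffs


if __name__ == "__main__":
    master = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
    slave = [(1, "a"), (2, "b"), (3, "x"), (4, "d")]
    print(differing_chunks(master, slave))  # [1]
```

Only the chunks flagged as different need a row-by-row comparison, which keeps the check cheap on large tables.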

With the steps above, read-write separation can substantially improve database read performance. However, the design must fit the workload's characteristics and include mechanisms that compensate for replication lag, such as forced reads from the master and latency monitoring.