Concurrency Control and Conflict Resolution Mechanisms in Database Transactions
Problem Description
In high-concurrency database systems, multiple transactions may access the same data resources simultaneously. Without proper control, this can lead to data inconsistency issues. Common concurrency conflicts include dirty reads, non-repeatable reads, and phantom reads. This problem requires mastering how to detect and resolve these conflicts through concurrency control mechanisms (such as locking, timestamping, multi-version concurrency control, etc.) to ensure transaction isolation and data consistency.
1. Types and Causes of Concurrency Conflicts
1.1 Dirty Read
- Problem Description: Transaction A reads uncommitted modifications made by Transaction B. Later, Transaction B rolls back, causing Transaction A to read invalid data.
- Example:
- Transaction B changes an account balance from 100 to 200 (uncommitted).
- Transaction A reads the balance as 200.
- Transaction B rolls back, restoring the balance to 100. Transaction A's read result is incorrect.
1.2 Non-Repeatable Read
- Problem Description: Transaction A reads the same data multiple times. During this period, Transaction B modifies and commits the data, resulting in inconsistent read results for Transaction A.
- Example:
- Transaction A reads the balance for the first time as 100.
- Transaction B changes the balance to 200 and commits.
- Transaction A reads the balance again as 200, which is inconsistent with the first result.
1.3 Phantom Read
- Problem Description: Transaction A repeatedly queries a set of records matching a certain condition. During this period, Transaction B inserts or deletes records that match the condition and commits, causing Transaction A's two queries to return different numbers of records.
- Example:
- Transaction A queries users under 30 years old and returns 10 records.
- Transaction B inserts a new user aged 25 and commits.
- Transaction A queries again and returns 11 records.
2. Core Mechanisms of Concurrency Control
2.1 Locking Mechanism
- Basic Idea: A transaction must acquire a lock before accessing data; other transactions must wait for the lock to be released.
- Types of Locks:
- Shared Lock (S Lock): Used for read operations, allowing multiple transactions to read the same data simultaneously.
- Exclusive Lock (X Lock): Used for write operations, allowing only one transaction to exclusively access the data.
- Lock Granularity: Row locks, table locks, page locks, etc. (Refer to the previously covered topic "Database Lock Granularity and Types").
2.2 Two-Phase Locking Protocol (2PL)
- Phase 1: Growing Phase
The transaction gradually requests all required locks during execution but cannot release any locks. - Phase 2: Shrinking Phase
The transaction releases locks and cannot request new ones. - Purpose: Ensures serializable scheduling of conflicting transactions but may lead to deadlocks (requiring deadlock detection mechanisms).
2.3 Timestamp Ordering
- Core Rules:
- Assign a unique timestamp (e.g., start time) to each transaction.
- Data items record the last read/write timestamps.
- If a transaction's read/write operation timestamp is earlier than the last write timestamp of the data item, the operation is rejected, and the transaction is rolled back.
- Example:
- The last write timestamp for data X is T2.
- Transaction T1 (timestamp T1 < T2) attempts to write to X and is rejected.
2.4 Multi-Version Concurrency Control (MVCC)
- Principle: Maintains multiple versions of data items. Read operations access old versions, while write operations create new versions (refer to the previously covered topic "MVCC Implementation Principles").
- Advantage: Reads do not block writes, and writes do not block reads.
3. Specific Strategies for Conflict Resolution
3.1 Conflict Handling in Locking Mechanisms
- Scenario: Transactions T1 and T2 simultaneously request an X lock on the same data.
- Steps:
- The transaction that requests the lock first acquires it; the later requestor enters a wait queue.
- If a timeout or deadlock occurs, one of the transactions is rolled back (e.g., based on timestamp or lock wait graph detection).
3.2 Conflict Handling in MVCC
- Write-Write Conflict:
- Transactions T1 and T2 simultaneously modify the same data row.
- The system checks whether the current version of the data row has already been committed:
- If not committed, the later-committing transaction is rolled back (e.g., PostgreSQL's SSI isolation level).
- If already committed, the later-committing transaction must re-execute the modification (optimistic locking mechanism).
- Read-Write Conflict:
- Read operations access snapshot versions, while write operations generate new versions. The two do not block each other.
3.3 Optimistic Concurrency Control (OCC)
- Three Phases:
- Read Phase: The transaction reads data and caches modifications.
- Validation Phase: Before committing, checks whether the data has been modified by other transactions.
- Write Phase: If validation passes, commits the modifications; otherwise, rolls back.
- Applicable Scenarios: Systems with frequent reads, infrequent writes, and low conflict probability.
4. Practical Case: Concurrency Control in Bank Transfer Scenario
4.1 Problem Description
Transaction A (transferring 100 units) and Transaction B (querying balance) execute concurrently. Dirty reads and non-repeatable reads must be avoided.
4.2 Solution (Based on Locking + Isolation Level)
- Set Isolation Level to Repeatable Read:
- Transaction B's read operation requires an S lock; Transaction A's write operation requires an X lock.
- Execution Flow:
- Transaction B acquires an S lock and reads the balance (e.g., 100 units).
- Transaction A's request for an X lock is blocked (because the S lock is not released).
- Transaction B reads the balance again (still 100 units), ensuring repeatable reads.
- Transaction B commits and releases the S lock.
- Transaction A acquires the X lock, completes the transfer (balance becomes 0), commits, and releases the lock.
4.3 Exception Handling
- If Transaction A times out while waiting for the lock, the transfer operation is automatically rolled back.
- If a deadlock occurs, the system chooses to roll back the transaction with lower priority (e.g., the transaction occupying fewer resources).
5. Summary
- Lower Isolation Levels (e.g., Read Uncommitted): High performance but high risk, suitable for read-only statistical analysis.
- Higher Isolation Levels (e.g., Serializable): High safety but low concurrency; require index optimization to reduce lock contention.
- Modern Databases (e.g., MySQL/PostgreSQL): Typically use a mix of locking, MVCC, and optimistic control, dynamically adjusting strategies based on scenarios.
Through the above mechanisms, database systems can balance performance and consistency in high-concurrency environments, ensuring the safe execution of transactions.