Principles and Implementation of Database Transactions and ACID Properties
Problem Description
Database transactions are a core mechanism in backend systems for ensuring data consistency. ACID (Atomicity, Consistency, Isolation, Durability) are the four key properties of a transaction. Interviewers will examine: 1) The specific meaning of each ACID property; 2) How databases implement these properties through mechanisms like logging and locking; 3) The challenges in distributed scenarios.
I. Basic Concepts of Transactions and ACID Definitions
-
The Nature of Transactions
A transaction is an indivisible sequence of database operations (e.g., fund transfer: deduct from Account A, deposit to Account B). These operations either all succeed (commit) or all fail (rollback). -
Detailed Explanation of ACID Properties
- Atomicity: A transaction is the smallest unit of execution; operations are either all completed or none are executed.
- Consistency: After a transaction is executed, the database must transition from one consistent state to another consistent state (e.g., the total amount remains unchanged before and after a transfer).
- Isolation: Concurrent transactions are isolated from each other and do not see intermediate states.
- Durability: After a transaction is committed, the modifications are permanently saved and will not be lost even in the event of a system failure.
II. Implementation of Atomicity: Logging Mechanism
-
Core Problem
If the system crashes halfway through transaction execution, how do we roll back the partially executed operations? -
Undo Log Process
- Step 1: Before the transaction begins, write the old value of the data to be modified (e.g., the original balance of Account A) into the Undo log.
- Step 2: Execute the data modification (deduct from Account A).
- Step 3: If the transaction fails, read the old value from the Undo log to restore the data; if successful, clean up the log.
- Key Point: The log write must be completed before the data modification (Write-Ahead Logging principle).
III. Implementation of Durability: Redo Log and Flush Strategy
-
Core Problem
After a transaction is committed, the data might still be in the memory buffer and not yet written to disk. How do we ensure data is not lost in case of a power outage? -
Redo Log Process
- Step 1: Before the transaction is committed, write all modification operations (e.g., "change Account A balance to 100") sequentially into the Redo log file.
- Step 2: Force the Redo log to be flushed to disk (fsync operation), ensuring the log is persisted.
- Step 3: The data pages in memory can be flushed to disk later. During crash recovery, replay the operations from the Redo log.
- Advantage: Sequential writing to the log is faster than random writing to data pages, and the log volume is smaller.
IV. Implementation of Isolation: Concurrency Control Techniques
-
Problem Scenarios
Concurrent transactions may lead to dirty reads (reading uncommitted data), non-repeatable reads (different results from the same query), and phantom reads (newly inserted records being seen). -
Locking Mechanism
- Shared Lock (S Lock): Applied during read operations; other transactions can read but not write.
- Exclusive Lock (X Lock): Applied during write operations; other transactions cannot read or write.
- Two-Phase Locking Protocol (2PL): A transaction must acquire all needed locks first and release them uniformly after commit. To avoid deadlocks, it requires timeout or wait-for graph detection.
-
Multi-Version Concurrency Control (MVCC)
- Principle: Maintain multiple versions for each piece of data (marked by timestamps or transaction IDs).
- Read Operations: Can only see versions that were committed before the current transaction started (snapshot read).
- Write Operations: Create new versions without affecting reads of old versions.
- Advantage: Reads do not block writes, improving concurrency performance (used by InnoDB, PostgreSQL).
V. Implementation of Consistency: Constraints and Business Logic
-
Database Layer Guarantee
- Verify data rules at commit time through unique indexes, foreign key constraints, data type checks, etc.
- Example: Check the non-negative balance constraint during a fund transfer; roll back the transaction if violated.
-
Application Layer Responsibility
- Business logic must ensure consistency (e.g., "total amount validation") and be combined with transaction boundaries.
- For example: Query the balance within a transaction before executing the transfer to avoid concurrency issues.
VI. Challenges of Distributed Transactions
-
CAP Theorem Limitation
Distributed systems cannot simultaneously guarantee Consistency, Availability, and Partition Tolerance. Trade-offs must be made based on scenarios (e.g., CP systems like ZooKeeper vs. AP systems like Cassandra). -
Two-Phase Commit (2PC)
- Phase 1 (Prepare): The coordinator asks all participants if they can commit; participants pre-write logs and lock resources.
- Phase 2 (Commit/Rollback): If all participants agree, the coordinator sends a commit command; otherwise, it sends a rollback.
- Drawbacks: Synchronous blocking, single point of failure (coordinator crash may cause resources to be locked for a long time).
-
Flexible Transaction Solutions
- TCC Pattern: Implement Try (reserve resources), Confirm (confirm), and Cancel (cancel) interfaces at the business layer, ensuring eventual consistency through retries and compensation.
- Saga Pattern: Split a long transaction into multiple local sub-transactions, each with a corresponding compensation operation; execute compensations in reverse order upon failure.
Summary
The ACID properties of transactions are implemented through collaborative technologies such as logging (Undo/Redo), locking, and MVCC. In distributed scenarios, rigid transaction solutions (e.g., 2PC) or flexible transaction solutions (e.g., TCC) should be chosen based on business characteristics, balancing consistency and performance.