Idempotency Design in Distributed Systems

Description:
In distributed systems, an operation is idempotent if executing it once or many times has the same effect on system state. In payment or order processing, for example, the same request may reach the server more than once because of retries after network timeouts, duplicate client submissions, or duplicate deliveries from a message queue. Without idempotency, these repeats can cause serious problems such as duplicate deductions or duplicate orders. Idempotency design is therefore a core means of ensuring system reliability.

Solution Process:

  1. Understanding the Nature of Idempotency:

    • Idempotency concerns the final system state, not the process or the response. A repeated call may return a different response (a second DELETE may return 404, for example), but the state after it is the same. For example:
      • Idempotent Operations: HTTP GET (query data), PUT (overwrite update), DELETE (delete resource). Repeating the call produces no additional side effects.
      • Non-Idempotent Operations: HTTP POST (add resource), transfer operations. Each call may change the system state again.
    • Design Goal: Transform non-idempotent operations into idempotent operations through technical means.
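The difference between the two kinds of operation can be illustrated with a minimal sketch. The account dictionary and both function names are hypothetical, chosen only for illustration: an overwrite behaves like PUT, while a blind increment behaves like a retried transfer.

```python
def set_balance(account: dict, amount: int) -> dict:
    """Idempotent: overwrites the balance, so repeating the call changes nothing."""
    account["balance"] = amount
    return account


def add_to_balance(account: dict, amount: int) -> dict:
    """Non-idempotent: every call changes the state again."""
    account["balance"] += amount
    return account


account_a = {"balance": 100}
set_balance(account_a, 150)
set_balance(account_a, 150)  # simulated retry: state is unchanged

account_b = {"balance": 100}
add_to_balance(account_b, 50)
add_to_balance(account_b, 50)  # simulated retry: the amount is applied twice
```

After the retries, account_a holds the intended value, while account_b has been credited twice. The solutions below aim to make operations of the second kind behave like the first.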
  2. Common Scenarios Requiring Idempotency:

    • Network Retries: The client automatically resends requests when no response is received.
    • Message Queue Duplicate Consumption: For example, Kafka's At-Least-Once delivery semantics may cause consumers to process messages repeatedly.
    • User Interface Duplicate Submissions: Users clicking the submit button multiple times.
  3. Idempotency Implementation Solutions:

    • Solution 1: Unique Identifier (Token Mechanism)

      • Steps:
        1. Before initiating a request, the client requests a unique token (e.g., a UUID) from the server. The server stores the token in a cache with a short expiration time.
        2. The client sends the business request with the token attached.
        3. The server checks for and deletes the token in a single atomic step, so that two concurrent requests cannot both pass the check:
          • If the token existed, execute the business logic. Because the token is already deleted, it is valid only once.
          • If the token did not exist, it has already been used or has expired. Treat the request as a duplicate: reject it, or return the original result if the server saved that result when it first processed the request.
      • Applicable Scenarios: Frontend form submissions, short-term operations (e.g., payments).
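The token steps above can be sketched as follows. This is a minimal single-process illustration, not a production design: TokenStore is a hypothetical in-memory stand-in for a shared cache, and submit_payment is an assumed handler name. A lock makes the check-and-delete atomic, which is the property a real cache would have to provide.

```python
import threading
import time
import uuid


class TokenStore:
    """Hypothetical in-memory stand-in for a cache with expiring keys."""

    def __init__(self, ttl_seconds: float = 60.0) -> None:
        self._ttl = ttl_seconds
        self._expiry = {}  # token -> expiry time on the monotonic clock
        self._lock = threading.Lock()

    def issue(self) -> str:
        """Step 1: create a token and store it with a short expiration time."""
        token = str(uuid.uuid4())
        with self._lock:
            self._expiry[token] = time.monotonic() + self._ttl
        return token

    def consume(self, token: str) -> bool:
        """Step 3: atomically check and delete; True only for the first valid use."""
        with self._lock:
            expires_at = self._expiry.pop(token, None)
        return expires_at is not None and time.monotonic() < expires_at


store = TokenStore()
payments = []  # records each time the business logic actually runs


def submit_payment(token: str, order_id: str) -> str:
    if not store.consume(token):
        return "rejected: token missing, expired, or already used"
    payments.append(order_id)  # business logic runs once per token
    return "processed"


token = store.issue()
first = submit_payment(token, "order-1")
second = submit_payment(token, "order-1")  # simulated retry with the same token
```

In this sketch the retry is rejected rather than answered with the original result; returning the original result would require storing it under the token, which is left out here.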
    • Solution 2: Database Unique Constraint

      • Steps:
        1. Generate a unique key for the business request (e.g., order ID + business type).
        2. Before executing the business logic, insert this key into a deduplication table (or rely on a unique index in the business table), in the same transaction as the business write.
        3. If the insertion succeeds, proceed with the business logic. If it fails with a unique-constraint conflict, the request has already been processed (or is being processed concurrently), so skip the business logic.
      • Applicable Scenarios: Database-driven businesses (e.g., order creation).
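A minimal sketch of the deduplication-table steps, using Python's built-in sqlite3 module with an in-memory database. The table names, column names, and the create_order function are assumptions made for illustration. The key insert and the business insert share one transaction, so a failure in either rolls back both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dedup (request_key TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE orders (order_id TEXT, amount INTEGER)")
conn.commit()


def create_order(conn, request_key: str, order_id: str, amount: int) -> bool:
    """Return True if the order was created, False if the key was seen before."""
    try:
        with conn:  # one transaction: commit on success, roll back on error
            # The primary key rejects a second insert of the same request key.
            conn.execute(
                "INSERT INTO dedup (request_key) VALUES (?)", (request_key,)
            )
            conn.execute(
                "INSERT INTO orders (order_id, amount) VALUES (?, ?)",
                (order_id, amount),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate request: business logic is skipped


created = create_order(conn, "order-1001:create", "order-1001", 250)
duplicate = create_order(conn, "order-1001:create", "order-1001", 250)
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The second call hits the unique constraint and leaves the orders table untouched, so only one order row exists afterwards.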
    • Solution 3: State Machine Mechanism

      • Steps:
        1. Design a status field for business data (e.g., order status: unpaid, paid).
        2. Add status conditions when updating data (e.g., UPDATE orders SET status='paid' WHERE id=123 AND status='unpaid').
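        3. Check the number of rows affected by the statement. One row means this call performed the transition; zero rows means the record was no longer in the expected state (typically because an earlier call already applied the change), so the request is treated as a duplicate.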
      • Applicable Scenarios: Businesses with clear state transitions (e.g., approval workflows).
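The conditional update above can be tried out with a short sqlite3 sketch. The schema and the mark_paid function are hypothetical; the SQL statement is the one from step 2, with the order ID passed as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
conn.execute("INSERT INTO orders (id, status) VALUES (123, 'unpaid')")
conn.commit()


def mark_paid(conn, order_id: int) -> bool:
    """Return True only for the call that moves the order from unpaid to paid."""
    with conn:
        cursor = conn.execute(
            "UPDATE orders SET status = 'paid' WHERE id = ? AND status = 'unpaid'",
            (order_id,),
        )
    # rowcount is 1 when the transition happened, 0 when the guard did not match.
    return cursor.rowcount == 1


first = mark_paid(conn, 123)
second = mark_paid(conn, 123)  # simulated duplicate payment callback
status = conn.execute("SELECT status FROM orders WHERE id = 123").fetchone()[0]
```

Only the first call reports success, so follow-up actions such as shipping or notification can be tied to that return value.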
    • Solution 4: Optimistic Locking (Version Number)

      • Steps:
        1. Add a version number field (version) to the data table.
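        2. Read the current version together with the data, then specify it during the update: UPDATE table SET value=new_value, version=version+1 WHERE id=123 AND version=current_version.
        3. If no row is affected, the version number no longer matches, which indicates the data has already been updated, and the duplicate operation is rejected. A legitimate concurrent writer is rejected in the same way and must re-read the data before trying again.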
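A minimal sqlite3 sketch of the version check. The accounts table, its columns, and the update_balance function are assumptions made for illustration. Both the original request and its retry carry the version that was read before the first attempt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, balance INTEGER NOT NULL, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO accounts (id, balance, version) VALUES (123, 100, 0)")
conn.commit()


def update_balance(conn, account_id: int, new_balance: int, expected_version: int) -> bool:
    """Apply the update only if the row still has the version the caller read."""
    with conn:
        cursor = conn.execute(
            "UPDATE accounts SET balance = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (new_balance, account_id, expected_version),
        )
    return cursor.rowcount == 1


# The original request and its retry were both built from the version-0 read.
applied = update_balance(conn, 123, 150, expected_version=0)
retried = update_balance(conn, 123, 150, expected_version=0)
balance, version = conn.execute(
    "SELECT balance, version FROM accounts WHERE id = 123"
).fetchone()
```

The retry matches no row because the first update already advanced the version, so the balance is written once and the version ends at 1.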
  4. Design Considerations:

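    • Idempotency Key Generation: Ensure uniqueness (e.g., by combining user ID, timestamp, and random numbers). Generate the key once per logical operation and reuse it on every retry; a key regenerated for each attempt cannot be matched against the earlier attempt and defeats deduplication.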
    • Storage Selection: Tokens can be stored in Redis (high performance), while deduplication tables require database transactions to ensure consistency.
    • Timeout Handling: If a token expires before it is used, allow the user to obtain a new token and re-initiate the legitimate request instead of treating it as a duplicate.
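Two common ways to obtain an idempotency key can be sketched as follows. Both function names and the choice of fields are hypothetical. The first generates a random key once and reuses it across retries; the second derives the key from business fields, so any retry of the same operation produces the same key without the client having to remember it.

```python
import hashlib
import uuid


def new_idempotency_key() -> str:
    """Random key: generate once per logical operation, then reuse it on retries."""
    return str(uuid.uuid4())


def derived_idempotency_key(user_id: str, business_type: str, business_id: str) -> str:
    """Deterministic key: the same business fields always yield the same key."""
    raw = f"{user_id}|{business_type}|{business_id}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


# A retry loop would resend the same random key on each attempt.
key = new_idempotency_key()
attempts = [key for _ in range(3)]

k1 = derived_idempotency_key("user-42", "pay", "order-1001")
k2 = derived_idempotency_key("user-42", "pay", "order-1001")  # retry: same key
k3 = derived_idempotency_key("user-42", "pay", "order-1002")  # different order
```

A derived key suits operations that can legitimately happen only once per business object (such as paying a given order). A random key suits operations that a user may repeat on purpose, where each intentional repeat should get a fresh key.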

Summary:
The core of idempotency design is to transform non-idempotent operations into idempotent operations through unique identifiers or state control. The specific solution should be selected based on the business scenario, with a focus on the reliability and performance cost of uniqueness determination.