Unique ID Generation Solutions in Distributed Systems

Description: In distributed systems, generating globally unique IDs is a fundamental requirement. In e-commerce, social networks, and other large-scale applications, entities such as orders, user posts, and messages each need an identifier that never repeats anywhere in the system. The ID generation service must offer high availability, low latency, and high QPS (queries per second), and the IDs themselves should ideally carry some structure (e.g., be roughly increasing over time). Relying directly on a single database's auto-increment primary key does not work well in a distributed environment, because that database becomes both a single point of failure and a throughput bottleneck.

Problem-Solving Process:

Step 1: Clarify Core Requirements and Challenges

Before designing a solution, we must first clarify what conditions an excellent distributed ID generator needs to meet:

  1. Global Uniqueness: This is the most basic requirement. It must guarantee that IDs generated at any time and on any machine are unique.
  2. High Availability: The ID generation service must be highly available, as it is a foundational dependency for many business operations and cannot have single points of failure.
  3. Low Latency: The operation of generating an ID should be very fast and should not become a performance bottleneck for the system.
  4. High QPS: The system needs to withstand extremely high concurrent requests.
  5. Roughly Monotonically Increasing: It is preferable for generated IDs to be roughly increasing over time. This benefits database index performance (e.g., InnoDB's B+ tree index favors ordered inserts) and makes it easy to sort by time.
  6. Readability/Information Content: It is beneficial for IDs to contain meaningful information, such as timestamps or worker machine IDs, to facilitate troubleshooting.

Challenge: How to efficiently generate IDs that meet all the above conditions on multiple machines without central coordination.

Step 2: Analyze Common Solutions and Their Evolution

We will analyze several mainstream solutions, progressing from simple to complex.

Solution 1: UUID (Universally Unique Identifier)

  • Description: A UUID is a 128-bit number generated by a standard algorithm (e.g., version 1 combines a timestamp with the MAC address, while version 4 is built from random numbers), giving a very high probability of uniqueness.
  • Advantages:
    • Simple Implementation: Nearly every programming language has ready-to-use libraries.
    • Extremely High Performance: Generated locally with no network overhead and no dependency on any central service.
    • Practically Unique: Uniqueness is probabilistic rather than strictly guaranteed, but the probability of collision is negligible in practice.
  • Disadvantages:
    • Not Human-Readable: The usual text form is a 36-character string (e.g., 550e8400-e29b-41d4-a716-446655440000). Randomly generated UUIDs carry no ordering, so they are not roughly increasing over time.
    • Database Unfriendly: When used as a database primary key, unordered IDs cause frequent node splits and movements in B+ tree indexes during insert operations, severely impacting write performance.
  • Conclusion: Suitable for scenarios where storage size and index performance are not critical, but usually not the best choice for high-concurrency, large-data-volume distributed systems.
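
As a small illustration of the points above, the following sketch uses Python's standard uuid module to generate random (version 4) UUIDs locally. The variable names are only for illustration; the point is that no network call is involved and that generation order is unrelated to sort order.

```python
import uuid

# Version 4 UUIDs are built from random bits, so no coordination is needed.
ids = [uuid.uuid4() for _ in range(5)]

for u in ids:
    # str(u) is the familiar 36-character form; u.int is the 128-bit value.
    print(str(u), u.int.bit_length() <= 128)

# For random UUIDs, generation order and sort order are unrelated,
# which is why they scatter inserts across a B+ tree index.
print("generated in sorted order:", ids == sorted(ids))
```

Running this a few times will usually show that the IDs do not come out in sorted order, which is the behavior that hurts ordered indexes.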

Solution 2: Database Auto-increment ID (Single-Machine and Multi-Machine Modes)

  • Description: Utilizes the auto-increment primary key feature of a database table.
    • Single Database: All requests access the same database to obtain IDs via AUTO_INCREMENT.
      • Disadvantages: Single point of failure, obvious performance bottleneck, unable to meet high concurrency.
    • Database Cluster (Setting Different Increment Steps): For example, deploy two database instances. DB1's ID sequence is 1, 3, 5, 7..., and DB2's ID sequence is 2, 4, 6, 8... Conflicts are avoided by setting different starting values and increment steps.
      • Disadvantages: Poor scalability, since adding a node requires reconfiguring the step on every existing instance. The database also remains the bottleneck, because ID generation throughput is limited by the database's write capacity.
  • Conclusion: Simple but with limited scalability and performance, not suitable for very large-scale systems.
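
The step-and-offset idea can be simulated without a database. The sketch below is only a model: id_sequence and NODES are hypothetical names, and in MySQL the corresponding settings would be the auto_increment_offset and auto_increment_increment system variables. It also shows why adding an instance without changing the step leads to collisions.

```python
from itertools import count, islice

def id_sequence(offset: int, step: int):
    """Yield offset, offset + step, offset + 2*step, ... like one DB instance."""
    return count(start=offset, step=step)

NODES = 2  # hypothetical cluster size; the step must equal the node count

db1 = id_sequence(offset=1, step=NODES)
db2 = id_sequence(offset=2, step=NODES)

first_db1 = list(islice(db1, 4))
first_db2 = list(islice(db2, 4))
print(first_db1)  # [1, 3, 5, 7]
print(first_db2)  # [2, 4, 6, 8]

# Adding a third instance without changing the step collides with DB1.
db3 = id_sequence(offset=3, step=NODES)
collisions = set(islice(db3, 4)) & set(first_db1)
print(sorted(collisions))  # [3, 5, 7]
```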

Solution 3: Redis-based ID Generation

  • Description: Uses Redis's atomic INCR or INCRBY commands to generate incrementing IDs. Because Redis executes commands one at a time, concurrent callers never receive the same value.
  • Advantages: Much better performance than a database.
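  • Disadvantages: Requires introducing and maintaining a Redis cluster and addressing its persistence and data consistency issues (if the counter value is lost or rolled back after a restart or failover, previously issued IDs can be issued again). It also still depends on a centralized service.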
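
The counter semantics can be sketched without a running Redis server. In the example below, InMemoryCounter is a hypothetical stand-in whose incrby method mimics what INCRBY returns (the new value after an atomic add); with a real client such as redis-py, the corresponding call would be along the lines of r.incrby(key, amount). The key name "order:id" is also just an assumption.

```python
import threading
from collections import defaultdict

class InMemoryCounter:
    """Stand-in for a Redis server: incrby() atomically adds and returns the new value."""

    def __init__(self):
        self._values = defaultdict(int)
        # The lock plays the role of Redis's one-command-at-a-time execution.
        self._lock = threading.Lock()

    def incrby(self, key: str, amount: int = 1) -> int:
        with self._lock:
            self._values[key] += amount
            return self._values[key]

counter = InMemoryCounter()
issued = []
issued_lock = threading.Lock()

def worker(n: int) -> None:
    for _ in range(n):
        new_id = counter.incrby("order:id")
        with issued_lock:
            issued.append(new_id)

# Four concurrent callers, 1000 IDs each.
threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(issued), len(set(issued)), max(issued))  # 4000 4000 4000
```

Every caller gets a distinct value because the increment and the read happen as one step; the cost is that every ID requires a round trip to the shared counter.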

Solution 4: Snowflake and Its Variants

  • Description: This is a distributed ID generation algorithm open-sourced by Twitter. Its core idea is to partition a 64-bit long integer ID into several parts.
    • ID Structure:
      • 1-bit sign bit: Fixed as 0, representing a positive number.
      • 41-bit timestamp: Records the number of milliseconds elapsed from a custom epoch (e.g., 2020-01-01) to the current time. 41 bits cover 2^41 milliseconds, and (1L << 41) / (1000L * 60 * 60 * 24 * 365) evaluates to roughly 69, so the scheme can be used for approximately 69 years from the chosen epoch.
      • 10-bit worker machine ID: Used to identify different machines. Typically divided into a 5-bit high part for datacenter_id and a 5-bit low part for worker_id, allowing a maximum of 2^10 = 1024 nodes.
      • 12-bit sequence number: Sequence number for different IDs generated within the same millisecond. This means a single machine can generate up to 2^12 = 4096 IDs per millisecond.
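
The bit layout above can be sketched with shifts and masks. This is a minimal illustration under the 41/5/5/12 split described in the text; the constant and function names (pack, unpack, and so on) are hypothetical and not taken from any particular library.

```python
TIMESTAMP_BITS = 41
DATACENTER_BITS = 5
WORKER_BITS = 5
SEQUENCE_BITS = 12

WORKER_SHIFT = SEQUENCE_BITS                                      # 12
DATACENTER_SHIFT = SEQUENCE_BITS + WORKER_BITS                    # 17
TIMESTAMP_SHIFT = SEQUENCE_BITS + WORKER_BITS + DATACENTER_BITS   # 22

def pack(ts_ms: int, datacenter_id: int, worker_id: int, sequence: int) -> int:
    """Combine the four fields into one non-negative 63-bit integer."""
    fields = [(ts_ms, TIMESTAMP_BITS), (datacenter_id, DATACENTER_BITS),
              (worker_id, WORKER_BITS), (sequence, SEQUENCE_BITS)]
    for value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{value} does not fit in {bits} bits")
    return ((ts_ms << TIMESTAMP_SHIFT) | (datacenter_id << DATACENTER_SHIFT)
            | (worker_id << WORKER_SHIFT) | sequence)

def unpack(snowflake_id: int) -> tuple:
    """Recover (ts_ms, datacenter_id, worker_id, sequence) from an ID."""
    return (snowflake_id >> TIMESTAMP_SHIFT,
            (snowflake_id >> DATACENTER_SHIFT) & ((1 << DATACENTER_BITS) - 1),
            (snowflake_id >> WORKER_SHIFT) & ((1 << WORKER_BITS) - 1),
            snowflake_id & ((1 << SEQUENCE_BITS) - 1))

# Capacity implied by the field widths.
years = (1 << TIMESTAMP_BITS) / (1000 * 60 * 60 * 24 * 365)
nodes = 1 << (DATACENTER_BITS + WORKER_BITS)
ids_per_ms = 1 << SEQUENCE_BITS
print(round(years, 1), nodes, ids_per_ms)  # 69.7 1024 4096

example = pack(ts_ms=1, datacenter_id=1, worker_id=1, sequence=1)
print(example, unpack(example))  # 4329473 (1, 1, 1, 1)
```

Because the timestamp occupies the highest bits, comparing two IDs numerically compares their timestamps first, which is what makes the IDs roughly time-ordered.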
  • Workflow:
    1. When the service starts, it must be configured with its unique datacenter_id and worker_id (these can be allocated via a configuration center or a service discovery mechanism such as ZooKeeper).
    2. When an ID needs to be generated, the algorithm obtains the current timestamp.
    3. If the current timestamp is less than the timestamp of the last generated ID, a clock rollback has occurred, and an exception must be thrown or the algorithm must wait for the clock to catch up.
    4. If the current timestamp equals that of the last generated ID, the sequence number increments; if the sequence number is exhausted, wait until the next millisecond to continue generation. If the timestamp has advanced, the sequence number resets to 0.
    5. Concatenate the parts via bitwise operations to form a 64-bit long integer ID.
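
The workflow steps above can be sketched as a small generator class. This is a minimal illustration rather than Twitter's implementation: SnowflakeGenerator, EPOCH_MS, and the injectable clock parameter are assumptions made for the example, the epoch is the 2020-01-01 example from the text, and clock rollback is handled in the simplest way (raising an exception).

```python
import threading
import time

EPOCH_MS = 1577836800000  # 2020-01-01T00:00:00Z, the example epoch used in the text

class SnowflakeGenerator:
    def __init__(self, datacenter_id: int, worker_id: int, clock=None):
        if not 0 <= datacenter_id < 32 or not 0 <= worker_id < 32:
            raise ValueError("datacenter_id and worker_id must each fit in 5 bits")
        self._node = (datacenter_id << 5) | worker_id   # 10-bit machine part
        # The clock is injectable so that rollback can be simulated.
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Step 3: clock rollback detected; this sketch refuses to generate.
                raise RuntimeError(f"clock moved backwards by {self._last_ms - now} ms")
            if now == self._last_ms:
                # Step 4: same millisecond, so bump the 12-bit sequence.
                self._sequence = (self._sequence + 1) & 0xFFF
                if self._sequence == 0:
                    # Sequence exhausted: spin until the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            # Step 5: concatenate timestamp, machine ID, and sequence.
            return ((now - EPOCH_MS) << 22) | (self._node << 12) | self._sequence

gen = SnowflakeGenerator(datacenter_id=1, worker_id=2)
ids = [gen.next_id() for _ in range(5)]
print(ids)
print("strictly increasing:", all(a < b for a, b in zip(ids, ids[1:])))
```

A production variant would typically wait out small rollbacks instead of failing immediately, and would obtain datacenter_id and worker_id from an external allocator rather than hard-coded arguments.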
  • Advantages:
    • Fully Distributed: No need for a centralized service; extremely high performance (in theory up to 4096 IDs per millisecond, i.e. about 4 million IDs per second per machine).
    • Roughly Monotonically Increasing: IDs increase over time, which is friendly to database indexing.
    • Information Content: Information such as generation time and machine ID can be parsed from the ID.
  • Disadvantages:
    • Clock Rollback Issue: If a machine's clock rolls back, it may lead to duplicate ID generation. This is a key issue that the Snowflake algorithm must handle.
    • Machine ID Allocation: Requires a mechanism to manage and allocate datacenter_id and worker_id, ensuring their global uniqueness.
  • Variants and Improvements:
    • To address Snowflake's shortcomings, major companies have introduced their own variants, such as Baidu's UidGenerator and Meituan's Leaf. Notably, Leaf offers two modes:
      • Leaf-segment: A database segment-based mode. It fetches a segment of IDs (e.g., 1 to 1000) from the database into service memory. When issuing IDs, it increments directly in memory and fetches the next segment from the database only after the current one is exhausted. This significantly reduces database access pressure.
      • Leaf-snowflake: Based on the native Snowflake algorithm. It allocates worker_id using ZooKeeper sequential nodes and adds checks that detect clock rollback, mitigating rather than fully eliminating the issue.
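
The segment idea behind Leaf-segment can be sketched as follows. This is not Leaf's actual code: SegmentStore is a hypothetical in-memory stand-in for the database row that tracks the highest allocated ID, SegmentIdGenerator is a hypothetical service node, and refinements such as prefetching the next segment before the current one runs out are omitted.

```python
import threading

class SegmentStore:
    """Stand-in for the database table that records the highest ID handed out."""

    def __init__(self):
        self._max_id = 0
        self._lock = threading.Lock()
        self.round_trips = 0  # counts simulated database accesses

    def allocate(self, step: int) -> tuple:
        # In a real database this would be one atomic update of max_id.
        with self._lock:
            start = self._max_id + 1
            self._max_id += step
            self.round_trips += 1
            return start, self._max_id

class SegmentIdGenerator:
    """One service node issuing IDs from its in-memory segment."""

    def __init__(self, store: SegmentStore, step: int = 1000):
        self._store = store
        self._step = step
        self._next = 1
        self._end = 0  # empty segment, so the first call fetches one
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            if self._next > self._end:
                self._next, self._end = self._store.allocate(self._step)
            value = self._next
            self._next += 1
            return value

store = SegmentStore()
node_a = SegmentIdGenerator(store)
node_b = SegmentIdGenerator(store)

a_ids = [node_a.next_id() for _ in range(1500)]
b_ids = [node_b.next_id() for _ in range(1500)]

print(a_ids[0], a_ids[-1])  # 1 1500
print(b_ids[0], b_ids[-1])  # 2001 3500
print(store.round_trips)    # 4
```

Here 3000 IDs cost only four simulated database accesses. The trade-off is that IDs are unique but not globally time-ordered across nodes, and any unused remainder of a segment is lost if a node restarts.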

Step 3: Solution Comparison and Selection Recommendations

Solution | Advantages | Disadvantages | Applicable Scenarios
---------|------------|---------------|---------------------
UUID | Simple, high performance, decentralized | Unordered, poor storage performance, not human-readable | Small-scale applications, internal systems with low performance requirements
DB Auto-increment | Simple, IDs are increasing | Centralized bottleneck, poor scalability | Small monolithic applications, systems with moderate data volume
Redis | Better performance than DB | Requires Redis maintenance, centralized risk | Scenarios with existing Redis clusters that can accept its complexity
Snowflake/Leaf | High performance, high availability, roughly increasing | Requires solving clock rollback and machine ID allocation | Preferred choice for large-scale, high-concurrency distributed systems (e.g., e-commerce, social networks)

Conclusion: For modern large-scale distributed systems, Snowflake and its variants (such as Leaf) are among the most widely adopted and recommended solutions. They strike a good balance between performance, availability, and functionality. In practice, teams often choose a mature open-source implementation such as Leaf, which addresses most of the pain points of the native Snowflake algorithm and can be deployed as a highly available service.