Detailed Explanation of the LRU Cache Eviction Algorithm

Problem Description
The LRU (Least Recently Used) cache eviction algorithm is a commonly used cache management strategy. When cache space is insufficient, it prioritizes evicting the data that has not been accessed for the longest time. Please provide a detailed explanation of the core concept, data structure design, operational workflow, and time complexity analysis of the LRU algorithm.

Core Concept and Scenario
Imagine a library with limited shelf space and new books constantly being acquired. The librarian needs to decide which old books to remove: books that have been borrowed recently are likely more popular and should be kept, while books untouched for the longest time might be cleared first. This is the idea behind LRU—based on the principle of temporal locality of reference, it assumes that data accessed recently is more likely to be accessed again in the near future.

Data Structure Design
An efficient LRU implementation requires combining two data structures:

  1. Doubly Linked List: Stores cache entries, with nodes ordered by recency of access. The most recently accessed node is at the head, and the least recently accessed is at the tail. Given a reference to a node, a doubly linked list supports inserting or deleting it at any position in O(1) time.
  2. Hash Table: Uses the cache entry's key as the key and a pointer to the corresponding node in the linked list as the value. It allows rapid node lookup by key, achieving O(1) lookups (see the sketch after this list).
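
To make the pairing concrete, here is a minimal Python sketch of the two structures. The names (`Node`, `_remove`, `_push_front`) and the use of sentinel head/tail nodes are illustrative choices, not part of any standard API:

```python
class Node:
    """Doubly linked list node holding one cache entry."""
    def __init__(self, key, value):
        self.key = key      # kept so the tail's hash entry can be deleted on eviction
        self.value = value
        self.prev = None
        self.next = None


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.map = {}                  # hash table: key -> Node
        # Sentinel head/tail nodes remove empty-list edge cases.
        self.head = Node(None, None)   # head.next is the most recently used entry
        self.tail = Node(None, None)   # tail.prev is the least recently used entry
        self.head.next = self.tail
        self.tail.prev = self.head

    def _remove(self, node):
        """Unlink a node from the list in O(1)."""
        node.prev.next = node.next
        node.next.prev = node.prev

    def _push_front(self, node):
        """Insert a node right after the head sentinel in O(1)."""
        node.prev = self.head
        node.next = self.head.next
        self.head.next.prev = node
        self.head.next = node
```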

Step-by-Step Operational Workflow Analysis
Assume the cache capacity is 3, with an initial state as follows (the list head represents the most recent access):

Linked List: Empty
Hash Table: Empty

Step 1: Inserting Data (put operation)

  • Insert key-value pair (1, A)

    1. Create a new node [key=1, value=A]
    2. Insert the node at the head of the linked list: List → [1,A]
    3. Record the mapping in the hash table: {1 → node address}
      Result: Linked List [1,A] (head), Hash Table {1: node}
  • Insert (2, B) and (3, C)
    Insert them sequentially at the head. The list becomes [3,C] (head) ↔ [2,B] ↔ [1,A] (tail), and the hash table becomes {1: node1, 2: node2, 3: node3} (see the put sketch below).
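
In code, a hedged version of put might look like this, re-opening the LRUCache sketch from the previous section (the existing-key branch is an assumption, since the walkthrough above only inserts new keys):

```python
class LRUCache(LRUCache):
    # Re-opens the earlier sketch to add put(); in a real file this method
    # would live directly inside the original class body.
    def put(self, key, value):
        if key in self.map:
            # Existing key: overwrite the value and treat it as a fresh access.
            node = self.map[key]
            node.value = value
            self._remove(node)
            self._push_front(node)
            return
        if len(self.map) >= self.capacity:
            # Cache full: evict the least recently used entry (see Step 3).
            lru = self.tail.prev
            self._remove(lru)
            del self.map[lru.key]
        node = Node(key, value)
        self.map[key] = node
        self._push_front(node)  # new entries start out as most recently used
```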

Step 2: Accessing Data (get operation)

  • Query key=2 (get(2))
    1. Locate the address of node [2,B] via the hash table.
    2. Remove the node from its current position in the linked list (disconnect its forward and backward links).
    3. Re-insert it at the head of the linked list.
      Result: The linked list becomes [2,B] ↔ [3,C] ↔ [1,A], indicating key=2 is now the most recently accessed (see the get sketch below).
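
The corresponding get, again re-opening the sketch. Returning None on a miss is an assumed convention; interview-style interfaces often return -1 instead:

```python
class LRUCache(LRUCache):
    # Re-opens the sketch to add get().
    def get(self, key):
        if key not in self.map:
            return None              # cache miss
        node = self.map[key]
        self._remove(node)           # step 2: unlink from its current position
        self._push_front(node)       # step 3: re-insert at the head
        return node.value
```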

Step 3: Cache Eviction (when capacity is insufficient)

  • Insert new data (4, D) while the cache is already full (capacity=3).
    1. Locate the tail node of the linked list, [1,A] (the least recently accessed).
    2. Remove this node and simultaneously delete key=1 from the hash table.
    3. Insert the new node [4,D] at the head of the linked list.
      Result: Linked List [4,D] ↔ [2,B] ↔ [3,C], Hash Table {2: node2, 3: node3, 4: node4} (a trace replaying the whole walkthrough follows below).
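
Putting the three steps together, this trace replays the walkthrough using the sketch above; the inline states mirror the results shown in each step:

```python
cache = LRUCache(3)
cache.put(1, "A")              # list: [1,A]
cache.put(2, "B")              # list: [2,B] <-> [1,A]
cache.put(3, "C")              # list: [3,C] <-> [2,B] <-> [1,A]
cache.get(2)                   # list: [2,B] <-> [3,C] <-> [1,A]
cache.put(4, "D")              # evicts key=1; list: [4,D] <-> [2,B] <-> [3,C]
assert cache.get(1) is None    # key=1 is gone
assert cache.get(4) == "D"     # key=4 is present and now most recent
```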

Key Details

  • Necessity of the Doubly Linked List: When deleting a node, its predecessor must be updated to point past it. A singly linked list cannot reach the predecessor without an O(n) traversal, so middle deletions would no longer be O(1).
  • Thread Safety Considerations: In a multi-threaded environment, the hash table and the list must be updated atomically, so locking is required, but a single global lock can become a performance bottleneck. Java's LinkedHashMap (with accessOrder=true) provides LRU ordering but must be synchronized externally; concurrent caches such as Caffeine handle locking internally with finer-grained techniques. A minimal coarse-grained locking sketch follows this list.
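
As one illustration, a coarse-grained wrapper around the Python sketch above (the class name is hypothetical; a single lock is the simplest correct option, and also exactly the bottleneck described above):

```python
import threading


class SynchronizedLRUCache(LRUCache):
    """Serializes every cache operation behind one lock: correct, but
    all threads contend on the same lock under load."""

    def __init__(self, capacity):
        super().__init__(capacity)
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return super().get(key)

    def put(self, key, value):
        with self._lock:
            super().put(key, value)
```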

Time Complexity Analysis

  • get operation: Hash table lookup O(1) + linked list move O(1) = O(1)
  • put operation: Hash table insertion O(1) + linked list head insertion O(1) + (if eviction occurs) tail deletion O(1) = O(1)
    All operations therefore run in constant time (amortized, since hash table operations are O(1) on average), meeting the requirements for high-efficiency caching (see the standard-library sketch below).
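
These bounds are also what CPython's collections.OrderedDict delivers, since it is itself backed by a hash table plus a doubly linked list. A hedged equivalent of the hand-rolled sketch, keeping the most recent entry at the front to match the walkthrough:

```python
from collections import OrderedDict


class OrderedDictLRU:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()   # front = most recently used

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key, last=False)    # O(1) move to the front
        return self.data[key]

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            self.data.popitem(last=True)          # O(1) eviction from the back
        self.data[key] = value
        self.data.move_to_end(key, last=False)    # newest entry goes to the front
```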

Practical Application Scenarios

  • Database buffer pools (e.g., InnoDB Buffer Pool)
  • One of the memory eviction policy options in Redis
  • CPU cache replacement policies
  • Browser local storage management

Through the steps above, the LRU algorithm encodes recency of use as the order of a doubly linked list and relies on a hash table for constant-time lookup, thereby achieving efficient cache management.