LRU Cache Mechanism

Problem Description
The LRU (Least Recently Used) cache mechanism is a commonly used cache eviction policy: when cache space runs out, it evicts the data that has gone unused the longest. Design and implement a data structure that satisfies the LRU constraint. It must support the following two operations:

  • get(key): If the key exists in the cache, return its corresponding value and mark the entry as recently used; otherwise, return -1.
  • put(key, value): If the key already exists, update its value and mark it as recently used; if it does not exist, insert the key-value pair into the cache. If the insertion causes the number of cache entries to exceed the capacity, evict the least recently used entry.
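Before building the hash table + doubly linked list version developed below, the required behavior can be modeled in a few lines with Python's collections.OrderedDict (a reference sketch only; the class name is illustrative):

```python
from collections import OrderedDict

class LRUCacheSketch:
    """Reference model of LRU semantics using OrderedDict,
    which keeps keys in insertion/recency order."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key: int) -> int:
        if key not in self.data:
            return -1
        self.data.move_to_end(key)  # mark as recently used
        return self.data[key]

    def put(self, key: int, value: int) -> None:
        if key in self.data:
            self.data.move_to_end(key)  # updating also counts as a use
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used
```

For example, with capacity 2: after put(1, 1), put(2, 2), get(1), the call put(3, 3) evicts key 2, so get(2) returns -1 while get(1) still returns 1.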

Analysis of Solution Approach
To implement an LRU cache, we need to solve two core problems:

  1. Fast Lookup: Quickly find the corresponding value based on the key. A Hash Table can perform lookups in O(1) time.
  2. Maintaining Access Order: Maintain the access time order of all data entries to quickly identify the least recently used entry. Linked lists can efficiently adjust node order, but deleting a node in a singly linked list requires traversing to find its predecessor. A Doubly Linked List supports O(1) time complexity node deletion (when the node pointer is known).

Therefore, we adopt a Hash Table + Doubly Linked List structure:

  • The hash table maps each key to a reference to the corresponding linked list node, so a node can be located in O(1) time.
  • The doubly linked list sorts nodes by access time: recently accessed nodes are near the head of the list, and the least recently accessed nodes are near the tail.
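These two components might be declared as follows (a sketch; names are illustrative):

```python
class Node:
    """Doubly linked list node. The key is stored in the node so that,
    when the node is evicted from the tail, its hash table entry
    can be removed without a reverse search."""

    def __init__(self, key: int = 0, value: int = 0):
        self.key = key
        self.value = value
        self.prev = None  # predecessor pointer
        self.next = None  # successor pointer

# The hash table is then a plain dict mapping key -> Node.
cache: dict[int, Node] = {}
```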

Implementation Steps

  1. Define the Doubly Linked List Node
    Each node contains four fields: key (for reverse mapping to the hash table), value (the stored data), prev (predecessor pointer), next (successor pointer).

  2. Initialize the LRU Cache

  • Record the cache capacity.
  • Create a dummy head node and a dummy tail node, and make them point to each other. This avoids boundary checks during actual node operations.
  • Initialize an empty hash table.
  3. Implement the get Operation

  • Search for the key in the hash table:
    • If it does not exist, return -1.
    • If it exists, find the corresponding linked list node via the hash table, return its value, and move this node to the head of the list (marking it as recently used).

  4. Implement the put Operation

  • Search for the key in the hash table:
    • If it exists: Update the node's value and move this node to the head of the list.
    • If it does not exist:
      • Create a new node, insert it at the head of the list, and store the key and node pointer in the hash table.
      • If the cache size exceeds capacity after insertion, delete the node at the tail of the list (the least recently used) and remove its corresponding key from the hash table.

  5. Helper Method: Move Node to Head

  • Remove the node from the linked list (by modifying the pointers of its neighboring nodes).
  • Insert this node right after the dummy head node.

  6. Helper Method: Delete Tail Node

  • Find the node to be deleted via the predecessor pointer of the dummy tail node.
  • Remove this node from the list and delete its key from the hash table.
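The steps above can be assembled into a complete sketch in Python (one possible layout; the helper names are illustrative):

```python
class Node:
    """Doubly linked list node; key is kept for hash table cleanup on eviction."""

    def __init__(self, key: int = 0, value: int = 0):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None


class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.map = {}          # key -> Node
        self.head = Node()     # dummy head: most recently used side
        self.tail = Node()     # dummy tail: least recently used side
        self.head.next = self.tail
        self.tail.prev = self.head

    def _remove(self, node: Node) -> None:
        # Unlink the node by rewiring both neighbors' pointers.
        node.prev.next = node.next
        node.next.prev = node.prev

    def _add_to_head(self, node: Node) -> None:
        # Insert the node right after the dummy head.
        node.next = self.head.next
        node.prev = self.head
        self.head.next.prev = node
        self.head.next = node

    def get(self, key: int) -> int:
        if key not in self.map:
            return -1
        node = self.map[key]
        self._remove(node)
        self._add_to_head(node)   # mark as recently used
        return node.value

    def put(self, key: int, value: int) -> None:
        if key in self.map:
            node = self.map[key]
            node.value = value
            self._remove(node)
            self._add_to_head(node)
            return
        node = Node(key, value)
        self.map[key] = node
        self._add_to_head(node)
        if len(self.map) > self.capacity:
            lru = self.tail.prev  # real tail node = least recently used
            self._remove(lru)
            del self.map[lru.key]
```

Note that both helpers only touch a constant number of pointers, which is what keeps get and put at O(1).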

Key Points Explanation

  • Using dummy head and tail nodes simplifies linked list operations, avoiding special cases for empty lists or single-node lists.
  • When deleting a node in a doubly linked list, it is necessary to modify both the next pointer of its predecessor node and the prev pointer of its successor node.
  • Each time an existing node is accessed (via get or put), it must be moved to the head of the list to ensure the list order reflects the access time.

Complexity Analysis

  • Time Complexity: Both get and put operations are O(1).
  • Space Complexity: O(capacity), occupied by the hash table and the doubly linked list.

By combining the fast lookup of a hash table with the order maintenance of a doubly linked list, the LRU cache mechanism can efficiently satisfy both access order constraints and performance requirements.