LFU Cache Mechanism

Problem Description
LFU (Least Frequently Used) is a cache eviction policy. When the cache reaches its capacity limit, it evicts the least frequently used data item. Here, "frequency of use" refers to the number of times a data item has been accessed (via get) or updated (via put). If multiple items share the same lowest usage frequency, the one that has been unused for the longest time should be evicted. Please design and implement an LFU cache class that supports get and put operations with O(1) time complexity.
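To pin down the required behavior before worrying about O(1), here is a deliberately naive reference with O(n) eviction. The class name `NaiveLFU` is ours, not part of the problem; it only illustrates the contract (evict lowest frequency, break ties by least recent access):

```python
class NaiveLFU:
    """Naive LFU reference: correct semantics, O(n) eviction."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = {}   # key -> value
        self.freq = {}   # key -> access count
        self.time = {}   # key -> timestamp of last access (tie-breaker)
        self.clock = 0   # logical clock, incremented on every operation

    def get(self, key: int) -> int:
        if key not in self.data:
            return -1
        self.clock += 1
        self.freq[key] += 1
        self.time[key] = self.clock
        return self.data[key]

    def put(self, key: int, value: int) -> None:
        if self.capacity == 0:
            return
        self.clock += 1
        if key in self.data:
            self.data[key] = value
            self.freq[key] += 1
            self.time[key] = self.clock
            return
        if len(self.data) == self.capacity:
            # Evict the lowest frequency; on a tie, the oldest access wins.
            victim = min(self.data, key=lambda k: (self.freq[k], self.time[k]))
            for d in (self.data, self.freq, self.time):
                del d[victim]
        self.data[key] = value
        self.freq[key] = 1
        self.time[key] = self.clock
```

The `min` scan over all keys is what the O(1) design below eliminates.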

Solution Process

  1. Core Idea Analysis

    • To achieve O(1) get and put operations, we need to quickly find the corresponding value and frequency information via a key. Therefore, a hash table key -> (value, frequency) is fundamental.
    • The eviction policy involves two dimensions: frequency and time. When eviction is needed, we must find the item with the lowest frequency. If multiple items share the same frequency, evict the least recently used one (LRU rule).
    • We can organize keys with the same frequency into a doubly linked list (similar to the list in an LRU cache), so each frequency corresponds to a list. The head of the list is the most recently accessed node, and the tail is the least recently accessed node.
    • To quickly find the list corresponding to the minimum frequency, we need to maintain a variable minFreq to record the current minimum frequency.
    • We also need a hash table frequency -> linked list to quickly locate the corresponding list via frequency.
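In Python, each per-frequency doubly linked list can be replaced by a `collections.OrderedDict`, which preserves insertion order and supports both O(1) deletion by key and O(1) removal of the oldest entry. A quick illustration of the operations the design relies on:

```python
from collections import OrderedDict

bucket = OrderedDict()   # keys sharing one frequency, oldest first
bucket["a"] = None       # "a" arrived first
bucket["b"] = None
bucket["c"] = None

del bucket["b"]          # O(1) removal by key, like unlinking a list node
oldest, _ = bucket.popitem(last=False)  # pop the least recently added key
```

With this mapping, appending a key to the end of the `OrderedDict` plays the role of "adding to the head" in the linked-list description, and `popitem(last=False)` plays the role of "removing the tail".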
  2. Data Structure Design

    • keyToNode: A hash table mapping keys to nodes (where a node contains value, frequency, key).
    • freqToDict: A hash table mapping frequencies to a doubly linked list (or OrderedDict, used to maintain insertion order).
    • minFreq: An integer recording the current minimum frequency.
    • capacity: An integer representing the cache capacity.
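Assuming `OrderedDict` stands in for the doubly linked lists, the four structures above might be initialized as follows (a sketch; here a "node" is simply a `[value, freq]` pair, since the `OrderedDict` already tracks recency order):

```python
from collections import OrderedDict, defaultdict

class LFUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity                    # maximum number of entries
        self.keyToNode = {}                         # key -> [value, freq]
        self.freqToDict = defaultdict(OrderedDict)  # freq -> keys, oldest first
        self.minFreq = 0                            # cache is empty, no minimum yet
```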
  3. get Operation Procedure

    • If the key does not exist in keyToNode, return -1.
    • If the key exists:
      a. Obtain the corresponding node.
      b. Remove this node from the list in freqToDict[node.freq]. If the list becomes empty after removal and the current frequency node.freq equals minFreq, increment minFreq by 1.
      c. Increase the node's frequency node.freq by 1.
      d. Add the node to the head of the list in freqToDict[node.freq].
      e. Return the node's value.
  4. put Operation Procedure

    • If capacity is 0, return immediately.
    • If the key already exists:
      a. Update the key's corresponding value (process similar to get: first remove, increment frequency, then add to the head of the new frequency list).
    • If the key does not exist:
      a. If the cache is full (i.e., len(keyToNode) == capacity):
         1. Find the node to evict: the tail node of the list in freqToDict[minFreq] (the least recently used among the least frequently used).
         2. Remove this node both from the list in freqToDict[minFreq] and from keyToNode.
      b. Create a new node (key, value, frequency = 1).
      c. Add the new node to the head of the list in freqToDict[1].
      d. Add the new node to keyToNode.
      e. Set minFreq to 1 (a newly inserted node always has frequency 1, which is necessarily the minimum).
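The get and put procedures above can be sketched together in Python. This is one possible realization, not the only one: `OrderedDict` buckets replace hand-rolled doubly linked lists, a "node" is a `[value, freq]` pair, and because an `OrderedDict` appends at the end, the front of each bucket corresponds to the "tail" (least recently used) in the description above:

```python
from collections import OrderedDict, defaultdict

class LFUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.keyToNode = {}                         # key -> [value, freq]
        self.freqToDict = defaultdict(OrderedDict)  # freq -> keys, oldest first
        self.minFreq = 0

    def _touch(self, key: int) -> None:
        """Move key from its current frequency bucket to the next one."""
        freq = self.keyToNode[key][1]
        del self.freqToDict[freq][key]
        if not self.freqToDict[freq]:
            del self.freqToDict[freq]               # drop the empty bucket
            if self.minFreq == freq:
                self.minFreq = freq + 1             # the key moves to freq + 1
        self.keyToNode[key][1] = freq + 1
        self.freqToDict[freq + 1][key] = None       # newest entries go at the end

    def get(self, key: int) -> int:
        if key not in self.keyToNode:
            return -1
        self._touch(key)
        return self.keyToNode[key][0]

    def put(self, key: int, value: int) -> None:
        if self.capacity == 0:
            return
        if key in self.keyToNode:
            self._touch(key)
            self.keyToNode[key][0] = value
            return
        if len(self.keyToNode) == self.capacity:
            # Evict the least recently used key among those with minFreq.
            evicted, _ = self.freqToDict[self.minFreq].popitem(last=False)
            del self.keyToNode[evicted]
            if not self.freqToDict[self.minFreq]:
                del self.freqToDict[self.minFreq]
        self.keyToNode[key] = [value, 1]
        self.freqToDict[1][key] = None
        self.minFreq = 1
```

For example, with capacity 2: after put(1,1), put(2,2), get(1), a subsequent put(3,3) evicts key 2, since keys 1 and 2 have frequencies 2 and 1 respectively.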
  5. Key Points Explanation

    • Why is minFreq incremented only when the list becomes empty after removal and the node's frequency equals minFreq?
      Because minFreq records the smallest frequency currently present in the cache. If the list for that frequency becomes empty, no node with frequency minFreq remains; the removed node is immediately re-inserted with frequency minFreq + 1, so the new minimum is exactly minFreq + 1.
    • Why set minFreq to 1 after inserting a new node?
      Because new nodes have frequency 1, which is necessarily the current minimum frequency.

With the above design, both get and put run in O(1) time, because every underlying step is O(1): hash-table lookup, insertion, and deletion, as well as insertion and deletion in a doubly linked list (or OrderedDict).