Longest Increasing Subsequence (LIS) Problem

Longest Increasing Subsequence (LIS) Problem

The Longest Increasing Subsequence (LIS) problem is a classic problem in computer science. Given a sequence of integers, we need to find the longest (not necessarily contiguous) subsequence within it, such that the elements in this subsequence are strictly increasing. For example, for the sequence [10, 9, 2, 5, 3, 7, 101, 18], one of the longest increasing subsequences is [2, 5, 7, 101], with a length of 4.

There are two mainstream approaches to solve the LIS problem: Dynamic Programming (time complexity O(n²)) and Greedy combined with Binary Search (time complexity O(n log n)). We will focus on explaining the latter, more efficient method.

Method Two: Greedy Strategy + Binary Search (O(n log n))

The core idea of this method is: we try to make the increasing sequence grow as "slowly" as possible, so that it has greater potential to accommodate more subsequent numbers. We maintain an array tail (often named dp), where tail[i] represents the minimum ending element value among all increasing subsequences of length i+1.

Detailed Steps:

  1. Initialization:

    • Create an empty array tail to store potential subsequence ending values.
    • Place the first number of the given sequence nums into tail. At this point, tail = [nums[0]]. This indicates that we have currently found an increasing subsequence of length 1, and the minimum value for its ending is nums[0].
  2. Traverse the Sequence:

    • Starting from the second element, process each number in the sequence nums one by one. We refer to the currently processed number as x.
  3. Comparison and Placement Strategy:

    • Compare x with the last element of the tail array (i.e., the end of the current longest subsequence).

    • Case One: If x is greater than the last element of tail.

      • This means x can be appended to the current longest increasing subsequence, forming an even longer one.
      • Operation: Directly append x to the end of the tail array.
      • Example: Suppose tail = [2, 5] (meaning the minimum ending for a subsequence of length 1 is 2, and for length 2 is 5). Now x = 7. Since 7 > 5, we add 7, getting tail = [2, 5, 7]. This means we have found an increasing subsequence of length 3: [2, 5, 7].
    • Case Two: If x is less than or equal to the last element of tail.

      • This means x cannot extend the current longest subsequence. However, it might serve as a smaller ending value to update a subsequence of a shorter length, making that length's subsequence have greater "potential" (i.e., a smaller ending value).
      • Operation: In the tail array, find the first element that is greater than or equal to x, and replace it with x. This search process is performed using binary search to ensure efficiency.
      • Example: Suppose tail = [2, 5, 7]. Now x = 3.
        • 3 is not greater than 7, so we cannot extend the length.
        • We look for the first number in tail that is >= 3, which is 5 (at index 1).
        • We replace tail[1] (the 5) with 3. Now tail = [2, 3, 7].
      • What is the significance of this operation? We haven't changed the length of the current longest subsequence (still 3), but we have reduced the ending value for subsequences of length 2 from 5 to 3. This means that in the future, any number greater than 3 but less than 7 (e.g., 4) can now be appended after [2, 3] to form a subsequence of length 3: [2, 3, 4]. This condition is more lenient than the previous requirement of being greater than 5. This embodies the greedy idea of "letting the sequence grow slower for greater potential."
  4. Obtain the Result:

    • After traversing the entire nums sequence, the length of the tail array is the length of the longest increasing subsequence.
    • Important Note: The tail array itself is not necessarily an actual longest increasing subsequence that exists in nums, but it correctly records the length of the longest increasing subsequence.

Complete Example Demonstration:

Using the sequence nums = [10, 9, 2, 5, 3, 7, 101, 18] as an example.

  1. Initialization: tail = [10] (length len=1)
  2. Process x=9:
    • 9 < 10, find the first number >=9 in tail, which is 10 (index 0).
    • Replace: tail = [9] (len=1)
  3. Process x=2:
    • 2 < 9, find the first number >=2 in tail, which is 9 (index 0).
    • Replace: tail = [2] (len=1)
  4. Process x=5:
    • 5 > 2, append to end: tail = [2, 5] (len=2)
  5. Process x=3:
    • 3 < 5, find the first number >=3 in tail, which is 5 (index 1).
    • Replace: tail = [2, 3] (len=2)
  6. Process x=7:
    • 7 > 3, append to end: tail = [2, 3, 7] (len=3)
  7. Process x=101:
    • 101 > 7, append to end: tail = [2, 3, 7, 101] (len=4)
  8. Process x=18:
    • 18 < 101, find the first number >=18 in tail, which is 101 (index 3).
    • Replace: tail = [2, 3, 7, 18] (len=4)

Traversal ends. The length of the longest increasing subsequence is 4. Although the final tail array is [2, 3, 7, 18], a valid LIS in the original sequence is [2, 5, 7, 101] or [2, 3, 7, 101]. The algorithm successfully calculates the length.

Summary:
This method, by maintaining a "maximum potential" ending value array and efficiently updating it using binary search, optimizes the time complexity from O(n²) to O(n log n). It is the preferred algorithm for solving the LIS problem.