Dynamic Programming for Solving the Longest Common Subsequence (LCS) Problem

Dynamic Programming for Solving the Longest Common Subsequence (LCS) Problem

Problem Description
The Longest Common Subsequence (LCS) problem is a classic dynamic programming problem. Given two strings (or sequences) text1 and text2, return the length of their longest common subsequence. A subsequence is a new string formed from the original string by deleting some characters (or none) without changing the relative order of the remaining characters. For example, "ace" is a subsequence of "abcde", but "aec" is not. If the two strings have no common subsequence, return 0.

Solution Approach

  1. Define State:
    Use a two-dimensional array dp[i][j] to represent the LCS length of the first i characters of text1 and the first j characters of text2. For example, dp[3][2] represents the LCS length of the first 3 characters of text1 and the first 2 characters of text2.

  2. State Transition Equation:

    • If text1[i-1] == text2[j-1] (current characters match), then the LCS length increases by 1:
      dp[i][j] = dp[i-1][j-1] + 1.
    • If the characters do not match, then the LCS length is inherited from previous optimal solutions:
      dp[i][j] = max(dp[i-1][j], dp[i][j-1]).
  3. Initialize Boundaries:
    When i=0 or j=0, it means one string is empty, so the LCS length is 0:
    dp[0][j] = 0, dp[i][0] = 0.

  4. Table Filling Order:
    Traverse the two-dimensional array row-wise or column-wise, ensuring that when calculating dp[i][j], dp[i-1][j], dp[i][j-1], and dp[i-1][j-1] have already been computed.

  5. Return Result:
    The final result is stored in dp[len(text1)][len(text2)].

Step-by-Step Example
Take text1 = "abcde", text2 = "ace" as an example:

  1. Initialize a 6×4 two-dimensional array (length plus 1 for the empty string case), with the first row and first column all set to 0.
  2. Traverse i=1 to 5 (corresponding to each character of text1) and j=1 to 3 (corresponding to each character of text2):
    • i=1, j=1: text1[0]='a' matches text2[0]='a', so dp[1][1] = dp[0][0] + 1 = 1.
    • i=1, j=2: 'a' does not match 'c', take max(dp[0][2]=0, dp[1][1]=1) = 1.
    • Continue similarly, finally dp[5][3] = 3 (corresponding to LCS "ace").

Optimization Tips

  • The state array can be compressed to one dimension to save space (requires keeping the top-left value prev).
  • If the specific LCS string needs to be output, backtrack through the dp table (by comparing characters and adjacent states).

By following these steps, the LCS problem can be solved efficiently.