Segment Tree

Description
A segment tree is a binary tree data structure for efficiently handling range queries and range updates on an array. It suits scenarios that require frequent queries and updates over intervals of an array (interval sum, interval maximum, interval minimum, etc.). The segment tree recursively divides the whole array interval into sub-intervals, and each tree node stores aggregated information for one specific interval. With this structure, range queries take O(log n) time, and range updates also take O(log n) when combined with lazy propagation.

Basic Concepts and Structure

  1. Problem Scenario: Suppose there is an array arr, and two types of operations need to be supported: (1) Query the sum (or maximum, etc.) of interval [L, R]; (2) Increase each element in interval [L, R] by a certain value. In a naive approach, both querying and updating require O(n) time. The segment tree aims to optimize both operations to O(log n).

  2. Tree Structure Construction:

    • A segment tree is a nearly complete binary tree, so it is usually stored in a flat array (commonly sized 4n) instead of with explicit child pointers.
    • The root node represents the entire array interval [0, n-1].
    • Each non-leaf node represents an interval [L, R], which is divided into two halves: the left child node interval is [L, mid], and the right child node interval is [mid+1, R], where mid = (L + R) // 2.
    • Leaf nodes represent intervals of length 1 (i.e., single elements).
    • Each node stores the aggregated value for its corresponding interval (e.g., interval sum).
  3. Example: For the array arr = [1, 3, 5, 7, 9, 11], the process of constructing a sum segment tree is as follows:

    • The root node represents interval [0,5] with the value 36 (the sum of the interval).
    • The left child represents [0,2] with the value 1+3+5=9; the right child represents [3,5] with the value 7+9+11=27.
    • Recursive division continues until leaf nodes (e.g., [0,0] with value 1).
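
The division in this example can be reproduced with a short sketch. The helper name describe_nodes is hypothetical, and it recomputes each sum from a slice purely for illustration; a real segment tree derives parent values from its children instead.

```python
def describe_nodes(arr, lo, hi, depth=0, out=None):
    """Collect (depth, lo, hi, total) for every node covering arr[lo..hi].

    Illustration only: sums are recomputed from slices so the interval
    layout is easy to see. Assumes arr is non-empty.
    """
    if out is None:
        out = []
    out.append((depth, lo, hi, sum(arr[lo:hi + 1])))
    if lo < hi:  # non-leaf: split at mid = (lo + hi) // 2
        mid = (lo + hi) // 2
        describe_nodes(arr, lo, mid, depth + 1, out)
        describe_nodes(arr, mid + 1, hi, depth + 1, out)
    return out


arr = [1, 3, 5, 7, 9, 11]
nodes = describe_nodes(arr, 0, len(arr) - 1)
for depth, lo, hi, total in nodes:
    # Indent by depth so the tree shape is visible in the output.
    print("  " * depth + f"[{lo},{hi}] sum={total}")
```

Note that with mid = (L + R) // 2 the children of [0,2] are [0,1] and [2,2], and the children of [3,5] are [3,4] and [5,5].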

Construction of Segment Tree

  1. Recursive Construction: Starting from the root node, recursively create left and right subtrees until the interval length becomes 1, at which point the leaf node value is set.
  2. Node Value Calculation: The value of a non-leaf node is aggregated from the values of its left and right child nodes (e.g., for summation, parent node value = left child value + right child value).
  3. Time Complexity: Constructing the tree initializes every node once. A segment tree over n elements has exactly 2n-1 nodes (n leaves and n-1 internal nodes), so the construction time is O(n).
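
The construction steps above can be sketched as follows. This is a minimal sketch under some assumptions that are not in the text: the tree is stored in a flat list with the root at index 0 and the children of node i at indices 2i+1 and 2i+2, the list is sized 4n as a safe upper bound, and the array is non-empty. The function name build is illustrative.

```python
def build(arr, tree, node, lo, hi):
    """Fill tree[node] with the sum of arr[lo..hi] (assumes lo <= hi)."""
    if lo == hi:
        # Leaf node: interval of length 1 holds the element itself.
        tree[node] = arr[lo]
        return
    mid = (lo + hi) // 2
    left, right = 2 * node + 1, 2 * node + 2
    build(arr, tree, left, lo, mid)
    build(arr, tree, right, mid + 1, hi)
    # Non-leaf node: aggregate the two children (here, by addition).
    tree[node] = tree[left] + tree[right]


arr = [1, 3, 5, 7, 9, 11]
tree = [0] * (4 * len(arr))  # 4n slots is a common safe bound
build(arr, tree, 0, 0, len(arr) - 1)
print(tree[0], tree[1], tree[2])  # root, left child, right child
```

For a different aggregate such as maximum, only the leaf value and the combining line would change.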

Range Query Operation

  1. Query Process: When querying the sum of interval [L, R], start from the root node recursively:
    • If the current node's interval is completely within [L, R], directly return the aggregated value stored in that node.
    • If the current node's interval does not overlap with [L, R], return 0 (for summation).
    • If the current node's interval partially overlaps with [L, R], recursively query the left and right subtrees and merge the results (e.g., add them for summation).
  2. Example: Query the sum of [1,4] (i.e., 3+5+7+9=24):
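    • The root node [0,5] partially overlaps, so query both subtrees.
    • Left subtree [0,2]: Partially overlaps with [1,4], so continue recursion. Its left child [0,1] partially overlaps: [0,0] does not overlap and returns 0, while [1,1] is completely within [1,4] and returns 3. Its right child [2,2] is completely within [1,4] and returns 5. The left subtree therefore returns 3+5=8.
    • Right subtree [3,5]: Partially overlaps with [1,4], so continue recursion. Its left child [3,4] is completely within [1,4] and returns 7+9=16. Its right child [5,5] does not overlap and returns 0.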
    • Merge results: 8+16=24.
  3. Efficiency: Once the query interval splits, the recursion follows at most two root-to-leaf paths and visits at most four nodes per level. Since the tree depth is O(log n), the time complexity is O(log n).
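
The three overlap cases can be sketched as below. As before, this assumes a flat-list layout with the root at index 0 and children at 2i+1 and 2i+2; the names build and query are illustrative, and the query bounds ql and qr are inclusive.

```python
def build(arr, tree, node, lo, hi):
    """Fill tree[node] with the sum of arr[lo..hi] (assumes lo <= hi)."""
    if lo == hi:
        tree[node] = arr[lo]
        return
    mid = (lo + hi) // 2
    build(arr, tree, 2 * node + 1, lo, mid)
    build(arr, tree, 2 * node + 2, mid + 1, hi)
    tree[node] = tree[2 * node + 1] + tree[2 * node + 2]


def query(tree, node, lo, hi, ql, qr):
    """Return the sum over [ql, qr]; tree[node] covers [lo, hi]."""
    if qr < lo or hi < ql:
        return 0  # no overlap: identity element for summation
    if ql <= lo and hi <= qr:
        return tree[node]  # node interval completely within the query
    # Partial overlap: query both children and merge the results.
    mid = (lo + hi) // 2
    return (query(tree, 2 * node + 1, lo, mid, ql, qr)
            + query(tree, 2 * node + 2, mid + 1, hi, ql, qr))


arr = [1, 3, 5, 7, 9, 11]
tree = [0] * (4 * len(arr))
build(arr, tree, 0, 0, len(arr) - 1)
print(query(tree, 0, 0, len(arr) - 1, 1, 4))  # 3+5+7+9 = 24
```

For maximum or minimum queries, the value returned on no overlap would be the corresponding identity (negative or positive infinity) rather than 0.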

Range Update Operation

  1. Lazy Propagation: Updating every affected leaf node directly is inefficient (O(n) per update). Lazy tags avoid this by deferring work:
    • When updating interval [L, R], if the current node's interval is completely within [L, R], update the node's value and set a lazy tag (indicating that child nodes need to be updated later).
    • During querying or updating, if the current node has a lazy tag, first push the tag down to its child nodes (update child node values and tags), then clear the current tag.
  2. Update Steps:
    • Check the overlap between the current node's interval and the update interval (similar to querying).
    • If completely within, update the node's value and set a lazy tag.
    • If partially overlapping, first push down the lazy tag (if needed), then recursively update the left and right subtrees, and finally update the current node's value (based on the new values of child nodes).
  3. Example: Add 2 to each element in interval [1,3]:
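    • Root node [0,5] partially overlaps, push down tag (no operation, since no tag is set), recursively update left and right subtrees.
    • Left subtree [0,2]: Partially overlaps, so recurse. Its left child [0,1] partially overlaps: [0,0] lies outside the update interval and is skipped, while [1,1] is completely within, so its value increases by 2*1=2 (interval length 1, each element increased by 2) and a lazy tag is set. Its right child [2,2] is completely within and is handled the same way. Backtracking gives [0,1] = 1+5 = 6 and [0,2] = 6+7 = 13.
    • Right subtree [3,5]: Partially overlaps, so recurse. Its left child [3,4] partially overlaps: [3,3] is completely within, so its value increases by 2 (to 9) and a tag is set, while [4,4] is skipped. Its right child [5,5] does not overlap. Backtracking gives [3,4] = 9+9 = 18 and [3,5] = 18+11 = 29.
    • Backtrack to update the root node value: 13+29=42, which equals the original 36 plus 2 for each of the 3 updated elements.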
  4. Time Complexity: With lazy propagation, update operations are also optimized to O(log n).
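
A range-add, range-sum tree with lazy tags might look like the sketch below. The class name LazySumTree and its method names are hypothetical, the flat-list layout (root at index 0, children at 2i+1 and 2i+2) is an assumption, and the array is assumed to be non-empty. A fully covered node of length len grows by delta * len.

```python
class LazySumTree:
    """Sum segment tree supporting range add via lazy tags (a sketch)."""

    def __init__(self, arr):
        self.n = len(arr)
        self.total = [0] * (4 * self.n)  # aggregated sum per node
        self.lazy = [0] * (4 * self.n)   # pending add for the children
        self._build(arr, 0, 0, self.n - 1)

    def _build(self, arr, node, lo, hi):
        if lo == hi:
            self.total[node] = arr[lo]
            return
        mid = (lo + hi) // 2
        self._build(arr, 2 * node + 1, lo, mid)
        self._build(arr, 2 * node + 2, mid + 1, hi)
        self.total[node] = self.total[2 * node + 1] + self.total[2 * node + 2]

    def _apply(self, node, lo, hi, delta):
        # Every element in [lo, hi] grows by delta.
        self.total[node] += delta * (hi - lo + 1)
        self.lazy[node] += delta

    def _push_down(self, node, lo, hi):
        # Only called on partial overlap, so the node is never a leaf.
        if self.lazy[node]:
            mid = (lo + hi) // 2
            self._apply(2 * node + 1, lo, mid, self.lazy[node])
            self._apply(2 * node + 2, mid + 1, hi, self.lazy[node])
            self.lazy[node] = 0

    def range_add(self, ql, qr, delta, node=0, lo=0, hi=None):
        if hi is None:
            hi = self.n - 1
        if qr < lo or hi < ql:
            return  # no overlap
        if ql <= lo and hi <= qr:
            self._apply(node, lo, hi, delta)  # completely within: tag it
            return
        self._push_down(node, lo, hi)
        mid = (lo + hi) // 2
        self.range_add(ql, qr, delta, 2 * node + 1, lo, mid)
        self.range_add(ql, qr, delta, 2 * node + 2, mid + 1, hi)
        # Recompute this node from the updated children.
        self.total[node] = self.total[2 * node + 1] + self.total[2 * node + 2]

    def range_sum(self, ql, qr, node=0, lo=0, hi=None):
        if hi is None:
            hi = self.n - 1
        if qr < lo or hi < ql:
            return 0
        if ql <= lo and hi <= qr:
            return self.total[node]
        self._push_down(node, lo, hi)
        mid = (lo + hi) // 2
        return (self.range_sum(ql, qr, 2 * node + 1, lo, mid)
                + self.range_sum(ql, qr, 2 * node + 2, mid + 1, hi))


t = LazySumTree([1, 3, 5, 7, 9, 11])
t.range_add(1, 3, 2)       # array becomes [1, 5, 7, 9, 9, 11]
print(t.range_sum(0, 5))   # 42
print(t.range_sum(1, 4))   # 5+7+9+9 = 30
```

Pushing tags down only on partial overlap keeps both operations within the same O(log n) node visits as a plain query.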

Application Scenarios and Variants

  • Segment trees are suitable for problems like interval summation, interval maximum/minimum, interval assignment, etc.
  • Variants include: two-dimensional segment trees (for matrix handling), persistent segment trees (supporting historical version queries), etc.

Through segment trees, both range query and update operations are optimized to logarithmic time complexity, making them one of the core data structures for solving interval-based problems.