Principle and Implementation of Union-Find Data Structure

Principle and Implementation of Union-Find Data Structure

Description
Union-Find (Disjoint-Set Union) is a data structure for managing partitions of elements. It supports two operations: merging (Union) two disjoint sets, and querying (Find) which set a specific element belongs to. It is widely used in algorithms such as connectivity determination and minimum spanning tree.

Core Concepts

  1. Each set is represented by a tree, with the root of the tree serving as the representative element of the set.
  2. Each node records its parent node. The parent of the root node points to itself.
  3. Determine whether two elements belong to the same set by finding their root nodes.

Basic Implementation Steps

  1. Initialization
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))  # Each node's parent initially points to itself
  1. Find Operation (Unoptimized)
def find(self, x):
    while self.parent[x] != x:  # Loop until the root is found
        x = self.parent[x]
    return x
  1. Union Operation (Unoptimized)
def union(self, x, y):
    root_x, root_y = self.find(x), self.find(y)
    if root_x != root_y:
        self.parent[root_y] = root_x  # Attach y's root under x's root

Optimization Strategies

  1. Path Compression Optimization
    During the find process, directly connect nodes to the root to reduce tree height:
def find(self, x):
    if self.parent[x] != x:
        self.parent[x] = self.find(self.parent[x])  # Recursively compress path
    return self.parent[x]
  1. Union by Rank Optimization
    Record the depth (rank) of each tree, always merging the smaller tree into the larger one:
def __init__(self, n):
    self.parent = list(range(n))
    self.rank = [0] * n  # Record the depth of each tree

def union(self, x, y):
    root_x, root_y = self.find(x), self.find(y)
    if root_x == root_y:
        return
    # Union by rank: Merge smaller tree into larger one
    if self.rank[root_x] < self.rank[root_y]:
        self.parent[root_x] = root_y
    elif self.rank[root_x] > self.rank[root_y]:
        self.parent[root_y] = root_x
    else:
        self.parent[root_y] = root_x
        self.rank[root_x] += 1  # When depths are equal, increment depth after merge

Time Complexity Analysis

  • Unoptimized: Worst-case O(n)
  • Only path compression or union by rank: O(log n)
  • Combined optimizations: Approximately O(α(n)), where α is the inverse Ackermann function, effectively constant time in practice.

Complete Code Example

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
    
    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]
    
    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        if self.rank[rx] < self.rank[ry]:
            self.parent[rx] = ry
        elif self.rank[rx] > self.rank[ry]:
            self.parent[ry] = rx
        else:
            self.parent[ry] = rx
            self.rank[rx] += 1
        return True