Kruskal's Algorithm (Minimum Spanning Tree)
Description
Kruskal's algorithm is a greedy algorithm used to find a Minimum Spanning Tree (MST) in a weighted, undirected, connected graph. A minimum spanning tree is a subgraph of the original graph that includes all vertices, is a tree (connected and acyclic), and has the minimum possible total edge weight. The core idea of Kruskal's algorithm is: consider edges in increasing order of weight; if adding the current edge does not create a cycle, include it in the spanning tree; otherwise, skip it. This algorithm is suitable for sparse graphs (relatively few edges) and typically uses a Union-Find (Disjoint Set Union) data structure to efficiently detect cycles.
Solution Process
Assume a weighted undirected graph with a vertex set {V} and an edge set {E}, where each edge has a weight. The goal is to construct a minimum spanning tree.
Step 1: Sort Edges by Weight
- First, sort all edges in the graph in non-decreasing order of their weights. For example, if edge weights are [5, 1, 3, 2], after sorting they become [1, 2, 3, 5]. This step ensures the algorithm prioritizes edges with smaller weights, following the greedy strategy.
Step 2: Initialize Union-Find
- Create a Union-Find (Disjoint Set Union) data structure to manage the connected components of vertices. Initially, each vertex forms its own set (i.e., each vertex's root is itself). The Union-Find provides two operations:
find(x): Finds the representative (root node) of the set containing vertex x, used to determine if two vertices belong to the same set.union(x, y): Merges the sets containing vertices x and y, used to connect two connected components.
- Simultaneously, initialize an empty set (or list) called MST to store the edges of the minimum spanning tree.
Step 3: Iterate Through Edges and Check for Cycles
- Iterate through each edge (u, v, weight) in the sorted order:
- Use the Union-Find operations
find(u)andfind(v)to check if u and v have the same root:- If the roots are different: This means u and v are in different connected components, so adding the edge (u, v) will not create a cycle. Perform the following:
- Add the edge (u, v, weight) to the MST set.
- Call
union(u, v)to merge the sets containing u and v, making them part of the same connected component.
- If the roots are the same: This means u and v are already in the same connected component, so adding the edge (u, v) would create a cycle. Therefore, skip this edge.
- If the roots are different: This means u and v are in different connected components, so adding the edge (u, v) will not create a cycle. Perform the following:
- Use the Union-Find operations
- Repeat this process until the MST contains |V| - 1 edges (since a tree always has one less edge than the number of vertices).
Step 4: Algorithm Termination and Result
- The algorithm terminates when the MST has |V| - 1 edges. At this point, the MST is the minimum spanning tree, containing all vertices with the minimum total weight.
- If all edges are processed and the MST still has fewer than |V| - 1 edges, it indicates that the original graph is not connected, and no minimum spanning tree exists (though problems typically assume the graph is connected).
Example
Assume a graph with 4 vertices {A, B, C, D} and the following edges with weights:
- (A, B, 1)
- (B, C, 2)
- (C, D, 3)
- (A, D, 4)
- (B, D, 5)
- Sort Edges: After sorting by weight, the order is [(A,B,1), (B,C,2), (C,D,3), (A,D,4), (B,D,5)].
- Initialization: In the Union-Find, each vertex is its own set: {A}, {B}, {C}, {D}; MST is empty.
- Process Edges:
- Edge (A,B,1): find(A) ≠ find(B), add to MST, merge {A,B} → sets become {AB}, {C}, {D}.
- Edge (B,C,2): find(B) ≠ find(C), add to MST, merge {AB,C} → sets become {ABC}, {D}.
- Edge (C,D,3): find(C) ≠ find(D), add to MST, merge {ABC,D} → sets become {ABCD}.
- Now MST has 3 edges (|V|=4, enough edges), algorithm terminates.
- Result: MST contains edges [(A,B,1), (B,C,2), (C,D,3)] with a total weight of 6.
Key Points
- Greedy Choice: Selecting the minimum weight edge each time ensures local optimality leads to global optimality.
- Cycle Detection: Using Union-Find for efficient connectivity checking, with time complexity close to O(1).
- Time Complexity: Sorting edges takes O(E log E). Union-Find operations take approximately O(E α(V)) (where α is the inverse Ackermann function, which grows extremely slowly). The overall complexity is dominated by sorting, i.e., O(E log E).