Minimum Spanning Tree Algorithms
A Minimum Spanning Tree (MST) is a classic problem in graph theory. Given a connected undirected graph, the minimum spanning tree is a subgraph of the original graph that is a tree, contains all vertices, and has the minimum possible total edge weight sum. This problem has important applications in network design, circuit routing, and other fields.
Problem Description
Assume there is a connected undirected graph G=(V, E), where V is the vertex set and E is the edge set. Each edge (u, v) has a weight w(u, v). Our goal is to find a subset of edges T ⊆ E such that:
- T connects all vertices (i.e., all vertices are connected in the subgraph formed by T).
- T does not contain any cycles (i.e., T is a tree).
- The sum of the weights of all edges in T, ∑w(u, v), is minimized.
Core Idea and Key Property
Algorithms for solving the MST problem are based on a key property: the Cut Property.
- Cut: Partition the vertex set V arbitrarily into two disjoint subsets S and V-S.
- Edge crossing a cut: An edge whose endpoints lie in S and V-S respectively is said to cross the cut (S, V-S).
- Cut Property: For any cut of the graph, the minimum weight edge crossing that cut must be part of some minimum spanning tree of the graph.
This property is the foundation of greedy algorithms. We can build the global optimal MST step by step by locally selecting the "currently best" edge (i.e., the minimum weight edge crossing some cut).
Two famous algorithms are based on this idea but differ in how they construct the cut: Prim's algorithm and Kruskal's algorithm.
Prim's Algorithm
Prim's algorithm starts from a single vertex and gradually "grows" a spanning tree. At each step, it adds the minimum weight edge that connects the set of visited vertices to the set of unvisited vertices.
Algorithm Steps
-
Initialization:
- Create a set
mstSet(or use a boolean arrayinMST) to track which vertices are already included in the MST. Initially, no vertices are in the set. - Create an array
key[], used to record the minimum weight needed to connect each vertex to the current MST. Initially, set allkeyvalues to infinity (∞), indicating no connecting edge has been found. - Create an array
parent[], used to record the parent of each vertex in the MST, so that the final tree can be constructed. - Choose any vertex as the starting point (e.g., vertex 0). Set its
keyvalue to 0, as it will be the root of the tree. Set itsparentto -1 (indicating no parent).
- Create a set
-
Process all vertices in a loop:
WhilemstSetdoes not contain all vertices, repeat the following steps:
a. Select the next vertex: From the vertices not yet inmstSet, select the vertexuwith the smallestkeyvalue. In the first iteration, this is naturally the starting vertex withkey=0.
b. Add vertex to MST: Add vertexutomstSet, meaning it is now part of the MST.
c. Update key values of adjacent vertices: For each adjacent vertexvofuthat is not yet visited (not inmstSet):
* If the weight of edge(u, v)is less than the currentkeyvalue ofv, then:
* Updatev'skeyvalue to the weight of edge(u, v).
* Updatev'sparenttou. This indicates that under the current plan, the best edge to connectvto the MST is(u, v). -
Output Result:
- Finally, the
parent[]array defines an MST. For each vertex i (i ≠ root),(parent[i], i)is an edge in the MST.
- Finally, the
Illustrative Process (Simplified)
Assume a graph with vertices A, B, C, D, and edges with weights: A-B:1, A-C:4, B-C:2, B-D:6, C-D:3.
- Initialization: Choose A as start,
key[A]=0,key[B]=key[C]=key[D]=∞. - Step 1: Select A (smallest
key). Update its neighbors B and C:key[B] = min(∞, 1) = 1,parent[B]=A;key[C] = min(∞, 4) = 4,parent[C]=A. - Step 2: From {B, C, D}, select B (
key=1). Add B to MST. Update B's neighbors C and D: For C, edge weight 2 <key[C](4), so updatekey[C]=2,parent[C]=B; For D, updatekey[D]=6,parent[D]=B. - Step 3: From {C, D}, select C (
key=2). Add C to MST. Update C's neighbor D: edge weight 3 <key[D](6), so updatekey[D]=3,parent[D]=C. - Step 4: Select D (
key=3). Add D to MST. All vertices processed. - Final MST edges: (A,B), (B,C), (C,D). Total weight: 1+2+3=6.
Time Complexity:
- Using an adjacency matrix and simple linear search to find the minimum key vertex: O(V²).
- Using an adjacency list and a min-heap (priority queue) to efficiently extract the minimum key vertex and update key values: O(E log V).
Kruskal's Algorithm
Kruskal's algorithm does not start from a single vertex. Instead, it considers all edges, sorted by weight, and tries to add them to the growing forest one by one, adding an edge only if it does not create a cycle.
Algorithm Steps
-
Initialization:
- Sort all edges of the graph in non-decreasing order of their weight.
- Create a Union-Find (Disjoint Set Union) data structure. Initially, each vertex is its own set.
- Initialize an empty set
MSTto store the final selected edges.
-
Process sorted edges:
Traverse each edge(u, v)in sorted order:- Use the Union-Find structure to check if vertices
uandvbelong to the same set (i.e., are already connected). - If they are in different sets: Adding edge
(u, v)will not create a cycle. Add this edge to theMSTset and perform a union operation to merge the sets containinguandv. - If they are in the same set:
uandvare already connected via other edges, adding(u, v)would create a cycle, so skip this edge.
- Use the Union-Find structure to check if vertices
-
Termination:
- The algorithm can terminate early when the number of edges in
MSTreaches |V| - 1, as the MST is complete. - Or, it continues until all edges are processed.
- The algorithm can terminate early when the number of edges in
Illustrative Process (Simplified)
Using the same example graph: edges A-B:1, A-C:4, B-C:2, B-D:6, C-D:3.
- Sorted edges: A-B(1), B-C(2), C-D(3), A-C(4), B-D(6).
- Process A-B(1): A and B are in different sets. Add to MST. Merge sets {A}, {B} into {A,B}.
- Process B-C(2): B in {A,B}, C in {C}. Different sets. Add to MST. Merge into {A,B,C}.
- Process C-D(3): C in {A,B,C}, D in {D}. Different sets. Add to MST. Merge into {A,B,C,D}. Now MST has 3 edges (|V|=4, 3=4-1). Algorithm ends.
- Final MST edges: A-B, B-C, C-D. Total weight: 6.
Time Complexity:
- Sorting edges takes O(E log E) time. Since E is at most O(V²), O(E log E) is equivalent to O(E log V).
- Union-Find operations (find and union) with path compression and union by rank are nearly constant time, O(α(V)), where α is the inverse Ackermann function, which grows extremely slowly.
- Therefore, the total time complexity is O(E log E) or O(E log V), dominated by the sorting step.
Summary and Comparison
| Feature | Prim's Algorithm | Kruskal's Algorithm |
|---|---|---|
| Core Idea | Grows a single tree from a starting vertex. | Builds the MST by merging forests, processing edges in sorted order. |
| Suitable Graph Type | More efficient for dense graphs (where E is close to V²). | More efficient for sparse graphs (where E is much less than V²). |
| Data Structures | Typically uses a priority queue (min-heap). | Requires sorting edges and uses a Union-Find structure. |
| Time Complexity | O(E log V) (with binary heap and adjacency list) | O(E log E) (dominated by sorting) |
Both algorithms are excellent examples of greedy algorithms, whose correctness is guaranteed by the cut property. In practice, the choice depends on the density of the graph.