Kruskal's Algorithm (Minimum Spanning Tree)

Kruskal's Algorithm (Minimum Spanning Tree)

Description
Kruskal's algorithm is a greedy algorithm used to find a Minimum Spanning Tree (MST) in a weighted, undirected, connected graph. A minimum spanning tree is a subgraph of the original graph that includes all vertices, is a tree (connected and acyclic), and has the minimum possible total edge weight. The core idea of Kruskal's algorithm is: consider edges in increasing order of weight; if adding the current edge does not create a cycle, include it in the spanning tree; otherwise, skip it. This algorithm is suitable for sparse graphs (relatively few edges) and typically uses a Union-Find (Disjoint Set Union) data structure to efficiently detect cycles.

Solution Process
Assume a weighted undirected graph with a vertex set {V} and an edge set {E}, where each edge has a weight. The goal is to construct a minimum spanning tree.

Step 1: Sort Edges by Weight

  • First, sort all edges in the graph in non-decreasing order of their weights. For example, if edge weights are [5, 1, 3, 2], after sorting they become [1, 2, 3, 5]. This step ensures the algorithm prioritizes edges with smaller weights, following the greedy strategy.

Step 2: Initialize Union-Find

  • Create a Union-Find (Disjoint Set Union) data structure to manage the connected components of vertices. Initially, each vertex forms its own set (i.e., each vertex's root is itself). The Union-Find provides two operations:
    • find(x): Finds the representative (root node) of the set containing vertex x, used to determine if two vertices belong to the same set.
    • union(x, y): Merges the sets containing vertices x and y, used to connect two connected components.
  • Simultaneously, initialize an empty set (or list) called MST to store the edges of the minimum spanning tree.

Step 3: Iterate Through Edges and Check for Cycles

  • Iterate through each edge (u, v, weight) in the sorted order:
    • Use the Union-Find operations find(u) and find(v) to check if u and v have the same root:
      • If the roots are different: This means u and v are in different connected components, so adding the edge (u, v) will not create a cycle. Perform the following:
        1. Add the edge (u, v, weight) to the MST set.
        2. Call union(u, v) to merge the sets containing u and v, making them part of the same connected component.
      • If the roots are the same: This means u and v are already in the same connected component, so adding the edge (u, v) would create a cycle. Therefore, skip this edge.
  • Repeat this process until the MST contains |V| - 1 edges (since a tree always has one less edge than the number of vertices).

Step 4: Algorithm Termination and Result

  • The algorithm terminates when the MST has |V| - 1 edges. At this point, the MST is the minimum spanning tree, containing all vertices with the minimum total weight.
  • If all edges are processed and the MST still has fewer than |V| - 1 edges, it indicates that the original graph is not connected, and no minimum spanning tree exists (though problems typically assume the graph is connected).

Example
Assume a graph with 4 vertices {A, B, C, D} and the following edges with weights:

  • (A, B, 1)
  • (B, C, 2)
  • (C, D, 3)
  • (A, D, 4)
  • (B, D, 5)
  1. Sort Edges: After sorting by weight, the order is [(A,B,1), (B,C,2), (C,D,3), (A,D,4), (B,D,5)].
  2. Initialization: In the Union-Find, each vertex is its own set: {A}, {B}, {C}, {D}; MST is empty.
  3. Process Edges:
    • Edge (A,B,1): find(A) ≠ find(B), add to MST, merge {A,B} → sets become {AB}, {C}, {D}.
    • Edge (B,C,2): find(B) ≠ find(C), add to MST, merge {AB,C} → sets become {ABC}, {D}.
    • Edge (C,D,3): find(C) ≠ find(D), add to MST, merge {ABC,D} → sets become {ABCD}.
    • Now MST has 3 edges (|V|=4, enough edges), algorithm terminates.
  4. Result: MST contains edges [(A,B,1), (B,C,2), (C,D,3)] with a total weight of 6.

Key Points

  • Greedy Choice: Selecting the minimum weight edge each time ensures local optimality leads to global optimality.
  • Cycle Detection: Using Union-Find for efficient connectivity checking, with time complexity close to O(1).
  • Time Complexity: Sorting edges takes O(E log E). Union-Find operations take approximately O(E α(V)) (where α is the inverse Ackermann function, which grows extremely slowly). The overall complexity is dominated by sorting, i.e., O(E log E).