Topological Sorting

Topological Sorting

Topological sorting is a linear ordering algorithm for directed acyclic graphs. This ordered sequence must satisfy one condition: for every directed edge (u -> v) in the graph, vertex u must appear before vertex v in the sequence. Topological sorting is often used to solve problems with dependencies, such as task scheduling, course arrangement, etc.

Let's understand it step by step:

  1. Core Idea and Prerequisites

    • Dependencies: The core of topological sorting is handling dependencies. For example, before studying "Data Structures," one must first study "Programming Fundamentals." In this case, "Programming Fundamentals" is a prerequisite for "Data Structures."
    • Directed Acyclic Graph (DAG): Topological sorting can only be applied to directed acyclic graphs. If the graph contains a cycle, then the dependencies form a loop (A depends on B, B depends on C, C depends on A), making it impossible to find a linear sequence that satisfies all conditions. Therefore, topological sorting would have no solution.
  2. Algorithm Steps (Kahn's Algorithm based on BFS)
    This is an intuitive and commonly used method, whose core idea is to continuously select vertices with "in-degree" zero.

    • Step One: Understanding In-degree

      • In-degree: The number of edges pointing to a vertex. For example, if there is an edge A -> B, then the in-degree of vertex B increases by 1. A vertex with no edges pointing to it has an in-degree of 0.
    • Step Two: Initialization

      • Calculate the in-degree of each vertex in the graph and store it in an array (e.g., inDegree[]).
      • Initialize a queue to hold all vertices with an in-degree of 0.
    • Step Three: Loop Processing

      • While the queue is not empty, execute the following loop:
        1. Dequeue: Remove a vertex from the queue (let's call it currentVertex).
        2. Add to Result: Add this currentVertex to the final topological sort result sequence.
        3. Process Adjacent Vertices: Traverse all adjacent vertices pointed to by currentVertex (let's call one of them neighbor).
          • Decrease the in-degree of neighbor by 1 (because its predecessor currentVertex has been removed).
          • Check if neighbor's in-degree becomes 0 after decrementing.
          • If the in-degree becomes 0, enqueue neighbor.
    • Step Four: Check for Cycles

      • After the loop ends, check the length of the topological sort result sequence.
      • If the result sequence includes all vertices in the graph, the topological sort is successful, and this sequence is one valid solution.
      • If the length of the result sequence is less than the total number of vertices in the graph, it indicates the presence of a cycle, making topological sorting impossible.
  3. Example
    Suppose we have 4 courses with their dependencies:

    • Course IDs: 0, 1, 2, 3
    • Dependencies: Before studying 1, must study 0; before studying 2, must study 1; before studying 3, must study 1.
    • Graph representation: 0 -> 1, 1 -> 2, 1 -> 3.

    Now, let's apply Kahn's algorithm for sorting:

    • Initialization:
      • Calculate in-degrees: Vertex 0 in-degree = 0, vertex 1 in-degree = 1, vertex 2 in-degree = 1, vertex 3 in-degree = 1.
      • Queue initialization: Enqueue vertex 0 (in-degree 0). Queue = [0].
    • First Loop:
      • Dequeue: Remove vertex 0.
      • Result sequence: [0].
      • Process vertex 0's adjacent vertex (only vertex 1): Decrease vertex 1's in-degree from 1 to 0.
      • Since vertex 1's in-degree is now 0, enqueue it. Queue = [1].
    • Second Loop:
      • Dequeue: Remove vertex 1.
      • Result sequence: [0, 1].
      • Process vertex 1's adjacent vertices (vertices 2 and 3):
        • Vertex 2's in-degree decreases from 1 to 0, enqueue it.
        • Vertex 3's in-degree decreases from 1 to 0, enqueue it.
      • Queue = [2, 3].
    • Third Loop:
      • Dequeue: Remove vertex 2.
      • Result sequence: [0, 1, 2].
      • Vertex 2 has no adjacent vertices, no operation. Queue = [3].
    • Fourth Loop:
      • Dequeue: Remove vertex 3.
      • Result sequence: [0, 1, 2, 3].
      • Vertex 3 has no adjacent vertices, no operation. Queue is empty.
    • Final Check:
      • The result sequence contains all 4 vertices, sorting is successful.
      • The obtained topological sequence is [0, 1, 2, 3]. It is worth noting that [0, 1, 3, 2] is also a valid topological sequence because there is no dependency between vertices 2 and 3; they can be swapped. Therefore, the result of topological sorting may not be unique.

Through this step-by-step process, we can see how the algorithm cleverly determines the execution order of tasks by managing vertex in-degrees, ensuring all prerequisites are met.