Topological Sorting
Topological sorting is a linear ordering algorithm for directed acyclic graphs. This ordered sequence must satisfy one condition: for every directed edge (u -> v) in the graph, vertex u must appear before vertex v in the sequence. Topological sorting is often used to solve problems with dependencies, such as task scheduling, course arrangement, etc.
Let's understand it step by step:
-
Core Idea and Prerequisites
- Dependencies: The core of topological sorting is handling dependencies. For example, before studying "Data Structures," one must first study "Programming Fundamentals." In this case, "Programming Fundamentals" is a prerequisite for "Data Structures."
- Directed Acyclic Graph (DAG): Topological sorting can only be applied to directed acyclic graphs. If the graph contains a cycle, then the dependencies form a loop (A depends on B, B depends on C, C depends on A), making it impossible to find a linear sequence that satisfies all conditions. Therefore, topological sorting would have no solution.
-
Algorithm Steps (Kahn's Algorithm based on BFS)
This is an intuitive and commonly used method, whose core idea is to continuously select vertices with "in-degree" zero.-
Step One: Understanding In-degree
- In-degree: The number of edges pointing to a vertex. For example, if there is an edge A -> B, then the in-degree of vertex B increases by 1. A vertex with no edges pointing to it has an in-degree of 0.
-
Step Two: Initialization
- Calculate the in-degree of each vertex in the graph and store it in an array (e.g.,
inDegree[]). - Initialize a queue to hold all vertices with an in-degree of 0.
- Calculate the in-degree of each vertex in the graph and store it in an array (e.g.,
-
Step Three: Loop Processing
- While the queue is not empty, execute the following loop:
- Dequeue: Remove a vertex from the queue (let's call it
currentVertex). - Add to Result: Add this
currentVertexto the final topological sort result sequence. - Process Adjacent Vertices: Traverse all adjacent vertices pointed to by
currentVertex(let's call one of themneighbor).- Decrease the in-degree of
neighborby 1 (because its predecessorcurrentVertexhas been removed). - Check if
neighbor's in-degree becomes 0 after decrementing. - If the in-degree becomes 0, enqueue
neighbor.
- Decrease the in-degree of
- Dequeue: Remove a vertex from the queue (let's call it
- While the queue is not empty, execute the following loop:
-
Step Four: Check for Cycles
- After the loop ends, check the length of the topological sort result sequence.
- If the result sequence includes all vertices in the graph, the topological sort is successful, and this sequence is one valid solution.
- If the length of the result sequence is less than the total number of vertices in the graph, it indicates the presence of a cycle, making topological sorting impossible.
-
-
Example
Suppose we have 4 courses with their dependencies:- Course IDs: 0, 1, 2, 3
- Dependencies: Before studying 1, must study 0; before studying 2, must study 1; before studying 3, must study 1.
- Graph representation: 0 -> 1, 1 -> 2, 1 -> 3.
Now, let's apply Kahn's algorithm for sorting:
- Initialization:
- Calculate in-degrees: Vertex 0 in-degree = 0, vertex 1 in-degree = 1, vertex 2 in-degree = 1, vertex 3 in-degree = 1.
- Queue initialization: Enqueue vertex 0 (in-degree 0). Queue = [0].
- First Loop:
- Dequeue: Remove vertex 0.
- Result sequence: [0].
- Process vertex 0's adjacent vertex (only vertex 1): Decrease vertex 1's in-degree from 1 to 0.
- Since vertex 1's in-degree is now 0, enqueue it. Queue = [1].
- Second Loop:
- Dequeue: Remove vertex 1.
- Result sequence: [0, 1].
- Process vertex 1's adjacent vertices (vertices 2 and 3):
- Vertex 2's in-degree decreases from 1 to 0, enqueue it.
- Vertex 3's in-degree decreases from 1 to 0, enqueue it.
- Queue = [2, 3].
- Third Loop:
- Dequeue: Remove vertex 2.
- Result sequence: [0, 1, 2].
- Vertex 2 has no adjacent vertices, no operation. Queue = [3].
- Fourth Loop:
- Dequeue: Remove vertex 3.
- Result sequence: [0, 1, 2, 3].
- Vertex 3 has no adjacent vertices, no operation. Queue is empty.
- Final Check:
- The result sequence contains all 4 vertices, sorting is successful.
- The obtained topological sequence is [0, 1, 2, 3]. It is worth noting that [0, 1, 3, 2] is also a valid topological sequence because there is no dependency between vertices 2 and 3; they can be swapped. Therefore, the result of topological sorting may not be unique.
Through this step-by-step process, we can see how the algorithm cleverly determines the execution order of tasks by managing vertex in-degrees, ensuring all prerequisites are met.