Topological Sorting
Topological sorting is a linear ordering algorithm for directed acyclic graphs (DAGs). This ordering satisfies: for every directed edge \(u \to v\) (from node \(u\) to node \(v\)) in the graph, \(u\) appears before \(v\) in the sorted result. Topological sorting is often used to solve problems involving dependency relationships, such as task scheduling, course arrangement, etc.
Core Concepts
- Directed Acyclic Graph (DAG): Topological sorting applies only to directed graphs, and the graph must not contain cycles. If a cycle exists in the graph, topological sorting is impossible because nodes in the cycle are mutually dependent, making a definitive order indeterminable.
- In-degree: The number of edges pointing to a node is called its in-degree. For example, if there is an edge \(A \to B\), the in-degree of node \(B\) increases by 1.
- Out-degree: The number of edges pointing out from a node is called its out-degree.
Algorithm Steps (Kahn's Algorithm)
Kahn's algorithm is an in-degree-based topological sorting algorithm. Its core idea is to repeatedly select nodes with an in-degree of 0, remove these nodes and their outgoing edges, and update the in-degrees of adjacent nodes. Specific steps are as follows:
-
Initialization:
- Calculate the in-degree of each node in the graph and record it.
- Initialize a queue (or stack) to store all nodes currently with an in-degree of 0.
- Initialize a list to store the topological sorting result.
-
Processing Nodes with In-degree 0:
- Dequeue nodes with an in-degree of 0 from the queue one by one and add them to the result list.
- For each dequeued node \(u\), traverse all its adjacent nodes \(v\) (i.e., edges \(u \to v\) exist) and decrease the in-degree of \(v\) by 1. If after decreasing, the in-degree of \(v\) becomes 0, add \(v\) to the queue.
-
Checking for Cycles:
- If the number of nodes in the result list equals the total number of nodes in the graph, the topological sorting is successful.
- If the number of nodes in the result list is less than the total number of nodes in the graph, it indicates the presence of a cycle, making topological sorting impossible.
Example Demonstration
Assume a directed graph with nodes \(A, B, C, D, E\) and edges:
- \(A \to B\)
- \(A \to C\)
- \(B \to D\)
- \(C \to D\)
- \(D \to E\)
Step 1: Initialization
- Calculate the in-degree of each node:
- \(A: 0\)
- \(B: 1\) (from \(A\))
- \(C: 1\) (from \(A\))
- \(D: 2\) (from \(B\) and \(C\))
- \(E: 1\) (from \(D\))
- Initialize the queue as \([A]\) (since only \(A\) has an in-degree of 0).
- Initialize the result list as empty.
Step 2: Processing Nodes
- Dequeue \(A\), add to result list: result = \([A]\).
- Update adjacent nodes \(B\) and \(C\) of \(A\):
- Decrease the in-degree of \(B\) by 1, making it 0, and add \(B\) to the queue.
- Decrease the in-degree of \(C\) by 1, making it 0, and add \(C\) to the queue.
- The queue becomes \([B, C]\).
- Update adjacent nodes \(B\) and \(C\) of \(A\):
- Dequeue \(B\), add to result list: result = \([A, B]\).
- Update adjacent node \(D\) of \(B\):
- Decrease the in-degree of \(D\) by 1, making it 1 (originally 2, minus the edge from \(B\)).
- The queue remains as \([C]\).
- Update adjacent node \(D\) of \(B\):
- Dequeue \(C\), add to result list: result = \([A, B, C]\).
- Update adjacent node \(D\) of \(C\):
- Decrease the in-degree of \(D\) by 1, making it 0, and add \(D\) to the queue.
- The queue becomes \([D]\).
- Update adjacent node \(D\) of \(C\):
- Dequeue \(D\), add to result list: result = \([A, B, C, D]\).
- Update adjacent node \(E\) of \(D\):
- Decrease the in-degree of \(E\) by 1, making it 0, and add \(E\) to the queue.
- The queue becomes \([E]\).
- Update adjacent node \(E\) of \(D\):
- Dequeue \(E\), add to result list: result = \([A, B, C, D, E]\).
Step 3: Checking the Result
- The result list contains 5 nodes, matching the total number of nodes in the graph, indicating successful topological sorting. The sorting result is \(A \to B \to C \to D \to E\) (Note: the order of \(B\) and \(C\) can be swapped because there is no dependency between them).
Time Complexity and Space Complexity
- Time Complexity: \(O(V + E)\), where \(V\) is the number of nodes and \(E\) is the number of edges. Each node and edge is processed only once.
- Space Complexity: \(O(V)\), used to store the in-degree array, queue, and result list.
Application Scenarios
Topological sorting is commonly used to solve dependency-related problems, such as:
- Course scheduling: certain courses must be completed before others can be taken.
- Task scheduling: some tasks must be completed before others.
- Compilation order: dependency relationships in module compilation.
Through the above steps, one can systematically understand the principles and implementation methods of topological sorting. If a cycle exists in the graph, the algorithm terminates early, thus detecting the presence of the cycle.