A* Search Algorithm
A* search algorithm is an efficient algorithm widely used in path planning and graph traversal. It combines the completeness of Dijkstra's algorithm (guaranteeing the shortest path) and the heuristics of the Greedy Best-First Search algorithm (tending to explore towards the goal), guiding the search direction through an evaluation function, thereby being faster than both in most cases.
1. Core Idea: Evaluation Function f(n)
The core of the A* algorithm lies in its calculation of an evaluation function for each node n in the graph:
f(n) = g(n) + h(n)
g(n): This is the actual cost from the start node to the current noden. This value is known and accurate, similar to the distance recorded in Dijkstra's algorithm.h(n): This is the estimated cost from the current nodento the goal node, also known as the heuristic function. This value is an estimate, not a precise value. The quality ofh(n)'s design directly determines the efficiency of the A* algorithm.
The algorithm's goal is to find a path that minimizes the f(n) value of the path's endpoint (goal node).
2. Detailed Algorithm Process
The algorithm needs to maintain two sets:
- Open List: A priority queue (usually a min-heap) containing all discovered but unevaluated nodes. Nodes are sorted by their
f(n)value, with the node having the smallestf(n)having the highest priority. - Closed List: A set (e.g., a hash table) containing all nodes that have been evaluated. The optimal paths for these nodes have been determined and do not need to be processed again.
Steps are as follows:
-
Initialization:
- Add the start node
Sto the open list. Setg(S) = 0, calculate its heuristic valueh(S), and its evaluation valuef(S) = g(S) + h(S). - Initialize the closed list as empty.
- Add the start node
-
Loop Start: While the open list is not empty, repeat the following steps:
a. Select Current Node: Remove the node with the smallestf(n)value from the open list, call it the current nodeN. If using a priority queue for the open list, this is a dequeue operation.
b. Goal Test: If the current nodeNis the goal nodeT, then the search is successful. We can reconstruct the path by backtracking fromTthrough parent nodes.
c. Expand Node: Move nodeNfrom the open list to the closed list, marking it as evaluated. Then, iterate through all neighboring nodesMofN.
d. Process Neighbor NodeM: For each neighborM:
* Skip Closed Nodes: IfMis in the closed list, ignore it directly because its shortest path is already determined.
* Calculate Temporary g-value: Calculate the cost from the start toMviaN:temp_g = g(N) + cost(N, M), wherecost(N, M)is the weight of the edge fromNtoM.
* Determine if a New or Better Path is Found:
* IfMis not in the open list, it is a newly discovered node. AddMto the open list, set itsg(M) = temp_g, calculatef(M) = g(M) + h(M), and recordNas the parent ofM.
* IfMis already in the open list, we have found a new path toM. Compare the cost of the new pathtemp_gwith the previously recordedg(M):
- Iftemp_g < g(M), the new path is better. UpdateM'sg(M)totemp_g, and updatef(M)accordingly. Also, setM's parent node toN. Sincef(M)has changed, adjustM's position in the open list (priority queue).
- Iftemp_g >= g(M), the new path is worse or equal, so do nothing. -
Loop End: If the open list becomes empty, it means all reachable nodes have been explored but the goal node has not been found. The search fails; no path exists from the start to the goal.
3. Key Properties of the Heuristic Function h(n)
The choice of the heuristic function h(n) is crucial. It must satisfy the following conditions to ensure the correctness and efficiency of the A* algorithm:
-
Admissibility:
h(n)must never overestimate the actual minimum cost from the current nodento the goal node. That is,h(n) <= h*(n), whereh*(n)is the true actual cost. This property guarantees that the path found by A* is optimal (shortest).- Example: In a grid map, the Manhattan distance (only up, down, left, right moves) or straight-line distance (Euclidean distance) are admissible, as they are optimistic estimates of the shortest path between two points.
-
Consistency (or Monotonicity): For any node
nand its neighborm, it must satisfyh(n) <= cost(n, m) + h(m). This can be understood as the heuristic function will not "cheat" you into thinking the cost fromnto the goal is greater than the cost of going to a neighbormand then to the goal. Consistency is a stronger form of admissibility. Ifh(n)is consistent, then when the A* algorithm expands a node, the path to that node is guaranteed to be optimal, itsg(n)value never needs to be updated again, and the algorithm is more efficient.
4. A Simple Example
Suppose there is a 4x4 grid, with the start at (0,0) and the goal at (3,3). We can use the Manhattan distance as the heuristic function: h(n) = |x_n - x_goal| + |y_n - y_goal|.
- Start (0,0):
g=0,h=6,f=6. Add to open list. - Remove (0,0) with smallest f for expansion. Check its neighbors (0,1) and (1,0).
- (0,1):
g=1,h=5,f=6. Add to open list. - (1,0):
g=1,h=5,f=6. Add to open list.
- (0,1):
- The open list now contains (0,1) and (1,0), both with f=6. Suppose we remove (0,1) for expansion. Check its neighbors (0,2), (1,1), (0,0).
- (0,2):
g=2,h=4,f=6. Add to open list. - (1,1):
g=2,h=4,f=6. Add to open list. - (0,0) is in the closed list, ignore.
- (0,2):
- The algorithm continues, always prioritizing the expansion of nodes that appear closer to the goal (smaller f value), until it finally finds the goal point (3,3).
Summary
The A* search algorithm intelligently guides the search direction by combining the known cost g(n) and the estimated cost h(n). While guaranteeing an optimal path, it significantly reduces the number of nodes that need to be explored, making it a very powerful and practical tool for solving shortest path problems.