AVL Tree (Balanced Binary Search Tree)

AVL Tree (Balanced Binary Search Tree)

An AVL tree is a self-balancing binary search tree proposed by Adelson-Velsky and Landis in 1962. Its core property is: for any node in the tree, the absolute difference in height (balance factor) between its left and right subtrees does not exceed 1. This constraint ensures the tree's height remains at O(log n) level, guaranteeing time complexities of O(log n) for search, insertion, and deletion operations.

1. Core Concept: Balance Factor

The balance factor is the key metric for judging balance in an AVL tree. For any node, its balance factor is defined as:
Balance Factor = Height of Left Subtree - Height of Right Subtree
According to the definition of an AVL tree, valid balance factors can only be -1, 0, or 1. If a node's balance factor absolute value exceeds 1, that node becomes an "unbalanced node," and the entire tree loses balance, requiring "rotation" operations to restore balance.

2. Node Structure

A node in an AVL tree typically contains the following fields:

  • key or value: The data stored in the node.
  • left: Pointer to the left child node.
  • right: Pointer to the right child node.
  • height: The height of this node. A node's height is defined as the number of edges on the path from this node to its farthest leaf node (or the number of nodes; the definition must be consistent; here, we use the number of edges). A leaf node's height is usually defined as 0, and an empty node's height is defined as -1.

3. Balancing Operations: Rotations

After inserting or deleting a node, starting from the insertion/deletion point and moving upwards, the first ancestor node whose balance factor becomes 2 or -2 is found; this node is the unbalanced node. There are four rotation operations for different imbalance scenarios:

  • Case 1: Left-Left Case (LL)

    • Description: A new node is inserted into the left subtree of the left subtree of the unbalanced node, causing the unbalanced node's balance factor to become 2, and its left child's balance factor to be 1 or 0 (typically 1).
    • Operation: Right Rotation
      1. Promote the unbalanced node's (call it A) left child (call it B) to be the new root node (new subtree root).
      2. Attach node B's original right subtree (call it T2) as node A's new left subtree.
      3. Attach node A as node B's new right subtree.
    • Effect: After rotation, the tree regains balance and maintains the binary search tree property.
  • Case 2: Right-Right Case (RR)

    • Description: A new node is inserted into the right subtree of the right subtree of the unbalanced node, causing the unbalanced node's balance factor to become -2, and its right child's balance factor to be -1 or 0 (typically -1).
    • Operation: Left Rotation
      1. Promote the unbalanced node's (A) right child (call it C) to be the new root node.
      2. Attach node C's original left subtree (T2) as node A's new right subtree.
      3. Attach node A as node C's new left subtree.
    • Effect: After rotation, the tree regains balance.
  • Case 3: Left-Right Case (LR)

    • Description: A new node is inserted into the right subtree of the left subtree of the unbalanced node, causing the unbalanced node's balance factor to become 2, and its left child's balance factor to be -1.
    • Operation: Left Rotation then Right Rotation
      1. First, perform a left rotation on the unbalanced node's left child (B), transforming the situation into the familiar LL case. This left rotation is performed on B and its right child (C).
      2. Then, perform a right rotation on the unbalanced node (A). This restores balance.
  • Case 4: Right-Left Case (RL)

    • Description: A new node is inserted into the left subtree of the right subtree of the unbalanced node, causing the unbalanced node's balance factor to become -2, and its right child's balance factor to be 1.
    • Operation: Right Rotation then Left Rotation
      1. First, perform a right rotation on the unbalanced node's right child (C), transforming the situation into the familiar RR case.
      2. Then, perform a left rotation on the unbalanced node (A).

4. Full Process of Insertion Operation

Suppose we want to insert a series of numbers into an initially empty AVL tree, using the insertion sequence [10, 20, 30, 40, 50, 25] as an example:

  1. Insert 10: The tree is empty, directly create root node 10. Node 10's height is 0, balance factor is 0.
  2. Insert 20: As the right child of 10. Update node 10's height to 1. Node 10's balance factor = (height of left subtree -1) - (height of right subtree 0) = (-1) - 0 = -1. Balanced, no rotation needed.
  3. Insert 30: As the right child of 20.
    • Starting from 30, check balance factors upwards:
      • Node 20: balance factor = 0 - 1 = -1, balanced.
      • Node 10: balance factor = (-1) - 1(height of 20) = -2. Unbalanced!
    • The unbalanced node is 10. Case determination: new node 30 is in the right subtree of 10's right child (20), which is an RR case.
    • Perform Left Rotation:
      • Promote node 20 to be the subtree's root.
      • Attach 20's original left subtree (currently empty) as 10's right subtree.
      • Attach 10 as 20's left subtree.
    • After rotation, the tree structure becomes 20 as root, 10 as left child, 30 as right child. All nodes' balance factor absolute values <=1, tree is balanced.
  4. Insert 40: As the right child of 30. Backtrack path: 40->30->20. All node balance factors are normal, no rotation needed.
  5. Insert 50: As the right child of 40.
    • Backtrack upwards:
      • Node 40: balance factor = (-1) - 1 = -2? Wait, we need to calculate heights first. Leaf node 50 height is 0. Node 40 height is 1. Node 30: right subtree height is 2 (height of 40 + 1), left subtree height is -1 (empty), balance factor = (-1) - 2 = -3? Note: The first unbalanced node we encounter backtracking is 30 (balance factor -2).
    • The unbalanced node is 30. Case: 50 is in the right subtree of 30's right child (40), which is an RR case.
    • Perform Left Rotation (on the subtree rooted at 30):
      • Promote 40 to be the subtree root.
      • Attach 40's original left subtree (empty) as 30's right subtree.
      • Attach 30 as 40's left subtree.
    • After rotation, the local tree structure is 40(left:30, right:50). Check the entire tree: root node 20's balance factor = 1(height of left subtree 10 is 0, so left subtree height is 0+1=1) - 2(height of new subtree root 40 is 1, so right subtree height is 1+1=2) = -1. Balanced.
  6. Insert 25: As the right child of 20's left child 10? No, 20's left child is 10, 10's right subtree is empty, so 25 should be inserted as 10's right child.
    • Backtrack and check:
      • Node 10: balance factor = (-1) - 1(height of 25 is 0) = 0? Wait, after inserting 25, 10's height becomes 1. Balance factor=0.
      • Node 20: balance factor = 2(height of 10 is 1+1=2) - 2(height of 40 is 1+1=2) = 0. Balanced.
      • Node 40: balance factor = 1(height of 30 is 0+1=1) - 1(height of 50 is 0+1=1) = 0.
    • All nodes balanced, no rotation needed.

5. Brief Description of Delete Operation

The delete operation is more complex than insertion because imbalance can propagate upwards to the root.

  1. Perform the standard BST delete operation (three cases: delete leaf node, delete node with one child, delete node with two children).
  2. Starting from the parent of the deleted node, backtrack upwards to the root, checking the balance factor of each ancestor node on the path.
  3. If an ancestor node is found to be unbalanced, perform the corresponding rotation operation based on the four imbalance cases mentioned above (LL, RR, LR, RL).
  4. Key Point: Rotation operations may change subtree heights, so even after rotating and fixing one unbalanced node, you must continue backtracking and checking upwards until the root, because imbalance can propagate upwards.

Summary
AVL trees enforce balance in binary search trees by introducing balance factors and rotation operations, thereby optimizing worst-case performance. Their advantage is very stable search efficiency. Their disadvantage is that insertion and deletion operations, due to the need to maintain balance, have relatively higher overhead with more rotations. In practical applications, if frequent insertions and deletions are required, Red-Black Trees might be a more common choice; if query operations far outnumber update operations, AVL trees are an excellent choice.