Principles and Operations of B+ Tree

Principles and Operations of B+ Tree

B+ Tree is a balanced multi-way search tree widely used in database and file system index structures. It is similar to B-tree, but internal nodes do not store data; all data is stored in leaf nodes, and leaf nodes are connected by pointers to form an ordered linked list.

I. Basic Properties of B+ Tree

  1. Each node has at most m child nodes (m-order B+ tree)
  2. The root node has at least two child nodes (unless it is the only node)
  3. Internal nodes (non-root, non-leaf) have at least ⌈m/2⌉ child nodes
  4. All leaf nodes are at the same level, forming complete balance
  5. Internal nodes only store keys (index keys), not actual data
  6. Leaf nodes store all key-value pairs and are connected by pointers

II. Search Operation in B+ Tree
The search process starts from the root node and proceeds downward layer by layer:

  1. Start from the root node, find the first position greater than or equal to the target key in the current node
  2. Follow the corresponding child node pointer to continue the search
  3. Repeat this process until reaching a leaf node
  4. Perform a sequential or binary search within the leaf node

Example: Searching for key value 25 in a 3-order B+ tree

Root node: [10, 20, 30]  // Find 25∈[20,30), go to the second child node
Intermediate node: [20, 25, 28]  // Find 25∈[25,28), go to the first child node
Leaf node: [25, 26, 27]  // Find the target key

III. Insertion Operation in B+ Tree
The insertion process must maintain balance, with steps including:

  1. Find the appropriate leaf node position
  2. If the leaf node has space, insert directly while maintaining order
  3. If the leaf node is full (number of keys = m), splitting is required:
    a. Create a new leaf node
    b. Evenly distribute the key-value pairs from the original node
    c. Copy the minimum key of the new node to the parent node
    d. If the parent node is full, recursively split the parent node

Example: Inserting 10, 20, 5, 15 sequentially into a 3-order B+ tree (m=3)

Initial: [10]
Insert 20: [10,20]  // Leaf node not full
Insert 5: [5,10,20]  // Leaf node full, needs splitting
After splitting:
Root node: [10]
Leaf node 1: [5,10] → Leaf node 2: [20]
Insert 15: Find the leaf node where it should be inserted [5,10,15]  // Splitting again
Final structure:
Root node: [10,15]
Leaf node 1: [5,10] → Leaf node 2: [15,20]

IV. Deletion Operation in B+ Tree
Deletion also requires maintaining balance:

  1. Find the leaf node containing the target key
  2. Delete the key-value pair; if the minimum key requirement (≥⌈m/2⌉-1) is still met, end
  3. If the number of keys is insufficient, consider merging or redistribution:
    a. If a sibling node has surplus, borrow a key
    b. Otherwise, merge with a sibling node
    c. Update the index keys in the parent node

V. Advantages of B+ Tree Compared to B-tree

  1. High efficiency for range queries: leaf node linked list supports sequential traversal
  2. More stable queries: all queries must go to leaf nodes
  3. More compact internal nodes: same memory can store more keys
  4. More suitable for disk I/O: node size is typically aligned with disk pages

Through these characteristics, B+ Tree performs excellently in data storage scenarios requiring extensive disk operations, becoming the de facto standard for database indexing.