Principles and Operations of B+ Tree

Principles and Operations of B+ Tree

B+ Tree is a balanced multi-way search tree widely used in database and file system index structures. It is similar to B-tree, but internal nodes do not store data; all data is stored in leaf nodes, and leaf nodes are connected by pointers to form an ordered linked list.

I. Basic Properties of B+ Tree

Each node has at most m child nodes (m-order B+ tree)
The root node has at least two child nodes (unless it is the only node)
Internal nodes (non-root, non-leaf) have at least ⌈m/2⌉ child nodes
All leaf nodes are at the same level, forming complete balance
Internal nodes only store keys (index keys), not actual data
Leaf nodes store all key-value pairs and are connected by pointers

II. Search Operation in B+ Tree
The search process starts from the root node and proceeds downward layer by layer:

Start from the root node, find the first position greater than or equal to the target key in the current node
Follow the corresponding child node pointer to continue the search
Repeat this process until reaching a leaf node
Perform a sequential or binary search within the leaf node

Example: Searching for key value 25 in a 3-order B+ tree

Root node: [10, 20, 30]  // Find 25∈[20,30), go to the second child node
Intermediate node: [20, 25, 28]  // Find 25∈[25,28), go to the first child node
Leaf node: [25, 26, 27]  // Find the target key

III. Insertion Operation in B+ Tree
The insertion process must maintain balance, with steps including:

Find the appropriate leaf node position
If the leaf node has space, insert directly while maintaining order
If the leaf node is full (number of keys = m), splitting is required:
a. Create a new leaf node
b. Evenly distribute the key-value pairs from the original node
c. Copy the minimum key of the new node to the parent node
d. If the parent node is full, recursively split the parent node

Example: Inserting 10, 20, 5, 15 sequentially into a 3-order B+ tree (m=3)

Initial: [10]
Insert 20: [10,20]  // Leaf node not full
Insert 5: [5,10,20]  // Leaf node full, needs splitting
After splitting:
Root node: [10]
Leaf node 1: [5,10] → Leaf node 2: [20]
Insert 15: Find the leaf node where it should be inserted [5,10,15]  // Splitting again
Final structure:
Root node: [10,15]
Leaf node 1: [5,10] → Leaf node 2: [15,20]

IV. Deletion Operation in B+ Tree
Deletion also requires maintaining balance:

Find the leaf node containing the target key
Delete the key-value pair; if the minimum key requirement (≥⌈m/2⌉-1) is still met, end
If the number of keys is insufficient, consider merging or redistribution:
a. If a sibling node has surplus, borrow a key
b. Otherwise, merge with a sibling node
c. Update the index keys in the parent node

V. Advantages of B+ Tree Compared to B-tree

High efficiency for range queries: leaf node linked list supports sequential traversal
More stable queries: all queries must go to leaf nodes
More compact internal nodes: same memory can store more keys
More suitable for disk I/O: node size is typically aligned with disk pages

Through these characteristics, B+ Tree performs excellently in data storage scenarios requiring extensive disk operations, becoming the de facto standard for database indexing.