Heap Sort Algorithm

Heap Sort Algorithm

Problem Description:
Heap sort is an efficient sorting algorithm based on the binary heap (complete binary tree) data structure. It utilizes the properties of a heap (max-heap or min-heap) to sort an array. The problem requires you to understand the principle of heap sort and be able to implement it with code.

Solution Process:

  1. Understand Heap Properties:

    • A heap is a complete binary tree, divided into max-heap and min-heap.
      • Max-heap: The value of each parent node is greater than or equal to the values of its child nodes (the root node is the largest).
      • Min-heap: The value of each parent node is less than or equal to the values of its child nodes (the root node is the smallest).
    • Heap sort typically uses a max-heap for ascending order sorting (min-heap is used for descending order).
    • In an array, the node relationships of a heap can be represented by indices:
      • Left child index of parent node i is 2*i + 1.
      • Right child index of parent node i is 2*i + 2.
      • Parent index of child node i is (i-1)/2 (floor division).
  2. Steps of Heap Sort:
    Heap sort consists of two main phases: Heapify (building the heap) and Sorting.

    • Heapify: Adjust the unsorted array into a max-heap.
    • Sorting: Repeatedly swap the top element (maximum value) with the element at the end of the heap, then readjust the heap until the entire array is sorted.
  3. Step-by-Step Implementation:
    Step 1: Implement the Heapify Function

    • Purpose: Adjust the subtree rooted at a given node to satisfy the max-heap property.
    • Process:
      1. Assume the current node i is the maximum.
      2. Compare it with its left and right child nodes to find the index of the maximum value.
      3. If the maximum is not the current node, swap the node values and recursively adjust the subtree of the swapped child node.
    • Example code (for max-heap):
      def heapify(arr, n, i):
          largest = i          # Set current node as maximum
          left = 2 * i + 1     # Left child index
          right = 2 * i + 2    # Right child index
      
          # Compare left child with current node
          if left < n and arr[left] > arr[largest]:
              largest = left
          # Compare right child with current maximum
          if right < n and arr[right] > arr[largest]:
              largest = right
          # If maximum is not the current node, swap and recursively adjust
          if largest != i:
              arr[i], arr[largest] = arr[largest], arr[i]
              heapify(arr, n, largest)  # Adjust the swapped child node's subtree
      

    Step 2: Build Heap (starting from the last non-leaf node)

    • In a complete binary tree, the index of the last non-leaf node is n//2 - 1 (n is the array length).
    • Traverse all non-leaf nodes from back to front, calling heapify on each to ensure the entire array satisfies the max-heap property.
    • Example code:
      def build_max_heap(arr):
          n = len(arr)
          # Traverse from the last non-leaf node to the front
          for i in range(n//2 - 1, -1, -1):
              heapify(arr, n, i)
      

    Step 3: Sorting

    • Swap the top element (maximum) with the current last element of the heap, placing the maximum in its correct position.
    • Reduce the heap's range (excluding sorted elements), adjust the new top element to restore the max-heap property.
    • Repeat until only one element remains in the heap.
    • Example code:
      def heap_sort(arr):
          n = len(arr)
          build_max_heap(arr)  # First build the heap
          # Traverse from back to front, swapping top and last elements
          for i in range(n-1, 0, -1):
              arr[0], arr[i] = arr[i], arr[0]  # Move maximum to the end
              heapify(arr, i, 0)  # Adjust the remaining heap (range reduced to i)
      
  4. Complete Code Example:

    def heap_sort(arr):
        n = len(arr)
        # Build heap
        for i in range(n//2 - 1, -1, -1):
            heapify(arr, n, i)
        # Sorting
        for i in range(n-1, 0, -1):
            arr[0], arr[i] = arr[i], arr[0]
            heapify(arr, i, 0)
    
    def heapify(arr, n, i):
        largest = i
        left = 2*i + 1
        right = 2*i + 2
        if left < n and arr[left] > arr[largest]:
            largest = left
        if right < n and arr[right] > arr[largest]:
            largest = right
        if largest != i:
            arr[i], arr[largest] = arr[largest], arr[i]
            heapify(arr, n, largest)
    
    # Test
    arr = [12, 11, 13, 5, 6, 7]
    heap_sort(arr)
    print("Sorted result:", arr)  # Output [5, 6, 7, 11, 12, 13]
    
  5. Time Complexity and Space Complexity:

    • Time Complexity: Heapify process is O(n), sorting process is O(n log n), overall O(n log n).
    • Space Complexity: Recursive calls to heapify use stack space, worst-case O(log n). It is an in-place sorting algorithm.

Summary:
The key to heap sort lies in understanding the properties of the heap and the adjustment process. By building the heap and repeatedly swapping the top element, the maximum values are gradually moved to the end of the array, ultimately achieving sorting. Pay attention to the recursive or iterative implementation of the heapify function, ensuring the heap property is maintained at each step.