Principles and Collision Resolution of Hash Tables

Principles and Collision Resolution of Hash Tables

A hash table is a data structure that uses a hash function to map keys to array indices, enabling efficient lookup. Its core idea is to convert a key into an array subscript, making the time complexity of search, insertion, and deletion operations close to O(1).

1. Hash Function
The hash function is the core of a hash table, responsible for converting keys of any size into array indices within a fixed range. An ideal hash function should satisfy:

Fast computation
Uniform distribution (to reduce collisions)
For example, hashing the string "key": calculate the sum of the ASCII values of each character, then take the modulo of the array size.

2. Hash Collision
A collision occurs when different keys produce the same index through the hash function. For example:

Array size: 10
ASCII sum of key "ab": 97+98=195 → 195%10=5
ASCII sum of key "ba": 98+97=195 → 195%10=5
Both map to the same index 5.

3. Collision Resolution Methods
3.1 Separate Chaining

Each array position maintains a linked list (or another container)
When a collision occurs, the new element is added to the end of the corresponding list
Example: After inserting "ab" and "ba", the linked list at index 5 contains two nodes

3.2 Open Addressing
When a collision occurs, search for the next available position according to a specific rule:

Linear Probing: Sequentially check the next positions (index+1, index+2...)
Quadratic Probing: Check index+1², index+2²... to avoid clustering
During lookup, compare sequentially along the probe sequence until found or an empty slot is encountered

4. Load Factor and Resizing
Load factor = number of elements / array size. When the load factor is too high (e.g., >0.75):

Collision probability increases, performance degrades
A larger array needs to be created, and all elements must be rehashed to new positions

5. Time Complexity Analysis

Ideal case (no collisions): O(1)
Worst case (all collisions): O(n)
Average case (with a good hash function): O(1)