Implementing Trie (Prefix Tree)

Implementing Trie (Prefix Tree)

Problem Description
Trie (pronounced like "try") is a tree data structure used to efficiently store and retrieve keys from a string dataset. This data structure has many application scenarios, such as autocomplete and spell checking. Please implement the Trie class, including insert, search, and prefix search functionality.

Basic Concepts
A Trie is a multi‑branch tree structure. Each node contains the following parts:

  • An array of child node pointers (typically sized 26, corresponding to the 26 lowercase letters)
  • A boolean field isEnd, marking whether this node is the ending character of a word

Implementation Steps

  1. Trie Node Design
class TrieNode:
    def __init__(self):
        self.children = [None] * 26  # child nodes for 26 letters
        self.is_end = False          # marks end of a word
  1. Trie Class Initialization
class Trie:
    def __init__(self):
        self.root = TrieNode()  # create the root node (empty node)
    
    def _char_to_index(self, char):
        """Convert character to index (0‑25)"""
        return ord(char) - ord('a')
  1. Insert Operation
  • Start from the root node
  • Traverse each character of the word:
    • Compute the corresponding index for the character
    • If the child node pointer at the current node is empty, create a new node
    • Move the current node to the child node
  • Mark is_end as True at the last node
def insert(self, word: str) -> None:
    node = self.root
    for char in word:
        index = self._char_to_index(char)
        if not node.children[index]:
            node.children[index] = TrieNode()  # create a new node
        node = node.children[index]  # move to child node
    node.is_end = True  # mark end of word
  1. Search for Complete Word
  • Traverse the word starting from the root node
  • If a child node for a character does not exist, return False
  • After traversal, check the is_end flag of the current node
def search(self, word: str) -> bool:
    node = self.root
    for char in word:
        index = self._char_to_index(char)
        if not node.children[index]:
            return False  # character does not exist
        node = node.children[index]
    return node.is_end  # check if it's a complete word
  1. Prefix Search
  • Similar to search, but only need to confirm that the prefix exists
  • No need to check the is_end flag
def startsWith(self, prefix: str) -> bool:
    node = self.root
    for char in prefix:
        index = self._char_to_index(char)
        if not node.children[index]:
            return False
        node = node.children[index]
    return True  # prefix exists

Complete Code Example

class TrieNode:
    def __init__(self):
        self.children = [None] * 26
        self.is_end = False

class Trie:
    def __init__(self):
        self.root = TrieNode()
    
    def _char_to_index(self, char):
        return ord(char) - ord('a')
    
    def insert(self, word):
        node = self.root
        for char in word:
            index = self._char_to_index(char)
            if not node.children[index]:
                node.children[index] = TrieNode()
            node = node.children[index]
        node.is_end = True
    
    def search(self, word):
        node = self.root
        for char in word:
            index = self._char_to_index(char)
            if not node.children[index]:
                return False
            node = node.children[index]
        return node.is_end
    
    def startsWith(self, prefix):
        node = self.root
        for char in prefix:
            index = self._char_to_index(char)
            if not node.children[index]:
                return False
            node = node.children[index]
        return True

Complexity Analysis

  • Time complexity for insert/search/prefix search: O(L), where L is the word length
  • Space complexity: worst‑case O(N×M), where N is the number of words and M is the average word length

Practical Applications

  • Autocomplete in search engines
  • Spell checker
  • Longest prefix matching in IP routing
  • Word‑bank storage for input methods