Iterators and Generators in Python

Description
Iterators and generators are Python's core tools for working with iterable objects. An iterator lets you access the elements of a container one at a time without exposing its underlying structure. A generator is a more concise way to write an iterator: a function that uses the yield keyword to produce values on demand. Understanding both helps when processing large data streams efficiently or when computation should be deferred until a value is actually needed.

1. Basic Concepts of Iterators

  • Iterable Object: Any object that can hand out an iterator, typically by implementing the __iter__ method (lists, tuples, and strings all do). Calling iter() on it returns an iterator.
  • Iterator: An object that implements both the __iter__ method (returns itself) and the __next__ method (returns the next value, raising a StopIteration exception when no values remain).
  • Example:
    my_list = [1, 2, 3]
    iter_obj = iter(my_list)  # Equivalent to my_list.__iter__()
    print(next(iter_obj))  # Outputs 1 (equivalent to iter_obj.__next__())
    print(next(iter_obj))  # Outputs 2
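
What a for loop does with iter() and next() can be sketched by hand. The helper name manual_for is hypothetical and exists only for illustration; it approximates the loop's behavior and is not how the interpreter implements it.

```python
def manual_for(iterable):
    """Collect items roughly the way a for loop would walk them."""
    collected = []
    iterator = iter(iterable)      # ask the iterable for an iterator
    while True:
        try:
            item = next(iterator)  # fetch the next value
        except StopIteration:      # raised once no values remain
            break
        collected.append(item)
    return collected

result = manual_for([1, 2, 3])
print(result)  # [1, 2, 3]
```

The StopIteration exception is the normal end-of-data signal here, which is why a for loop never shows it to the caller.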
    

2. Custom Iterators
Implement the iterator protocol via a class:

class Counter:
    def __init__(self, start, end):
        self.current = start
        self.end = end
    
    def __iter__(self):
        return self  # Returns the iterator itself
    
    def __next__(self):
        if self.current >= self.end:
            raise StopIteration
        value = self.current
        self.current += 1
        return value

# Usage example
for num in Counter(0, 3):
    print(num)  # Outputs 0, 1, 2

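Key Point: An iterator can only be traversed once. Counter has no reset mechanism, so once it is exhausted a second loop over the same object produces nothing; create a new Counter instance to traverse again.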
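
One way to get repeatable traversal is to separate the iterable from its iterator, so that every call to __iter__ hands out a fresh iterator. The sketch below assumes two hypothetical classes, CountRange and CountRangeIterator, which are not part of the original example.

```python
class CountRangeIterator:
    """One-shot iterator over the integers in [start, end)."""

    def __init__(self, start, end):
        self.current = start
        self.end = end

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.end:
            raise StopIteration
        value = self.current
        self.current += 1
        return value


class CountRange:
    """Reusable iterable: each __iter__ call builds a new iterator."""

    def __init__(self, start, end):
        self.start = start
        self.end = end

    def __iter__(self):
        return CountRangeIterator(self.start, self.end)


one_shot = CountRangeIterator(0, 3)
print(list(one_shot))  # [0, 1, 2]
print(list(one_shot))  # [] (already exhausted)

reusable = CountRange(0, 3)
print(list(reusable))  # [0, 1, 2]
print(list(reusable))  # [0, 1, 2] (a fresh iterator each time)
```

Built-in containers such as lists follow the same split: the list is the iterable, and each iter() call returns a separate iterator object.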

3. Introduction to Generators
Writing a custom iterator as a class takes a fair amount of boilerplate. Generators provide a simpler way to get the same behavior:

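  • Generator Function: A function whose body contains the yield keyword. Calling it does not run the body; it returns a generator object. Each call to next() then runs the body up to the next yield, hands back that value, and pauses with its local state intact until the following call.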
  • Example:
    def counter_generator(start, end):
        current = start
        while current < end:
            yield current
            current += 1
    
    gen = counter_generator(0, 3)
    print(next(gen))  # Outputs 0
    print(next(gen))  # Outputs 1
    

Generator functions return a generator object (a type of iterator), eliminating the need to manually implement __iter__ and __next__.
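
A short check of that claim, reusing counter_generator from the example above (repeated here so the snippet runs on its own):

```python
def counter_generator(start, end):
    current = start
    while current < end:
        yield current
        current += 1

gen = counter_generator(0, 2)
print(iter(gen) is gen)  # True: a generator is its own iterator
print(next(gen))         # 0
print(next(gen))         # 1

try:
    next(gen)            # the function body has finished
except StopIteration:
    print("exhausted")

# Because it is an iterator, a generator works directly in a for loop.
for value in counter_generator(0, 3):
    print(value)         # 0, 1, 2
```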

4. Advantages of Lazy Evaluation with Generators
Generators produce values on demand, saving memory and making them suitable for handling large-scale data streams:

def infinite_sequence():
    num = 0
    while True:
        yield num
        num += 1

gen = infinite_sequence()
print(next(gen))  # Outputs 0
print(next(gen))  # Outputs 1
# Does not occupy infinite memory; only one value is generated at a time
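
An infinite generator has to be bounded by its consumer. A minimal sketch of two common ways to do that, using itertools.islice from the standard library and a plain break (the cutoff values 5 and 50 are arbitrary):

```python
from itertools import islice

def infinite_sequence():
    num = 0
    while True:
        yield num
        num += 1

# islice stops asking for values after five items.
first_five = list(islice(infinite_sequence(), 5))
print(first_five)  # [0, 1, 2, 3, 4]

# A for loop needs its own exit condition when the source never ends.
for num in infinite_sequence():
    if num * num > 50:
        break
print(num)  # 8
```

Calling list(infinite_sequence()) directly would never return, since list() keeps requesting values until StopIteration.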

5. Generator Expressions
Similar to list comprehensions, but using parentheses, returning a generator object:

squares_gen = (x**2 for x in range(5))
print(next(squares_gen))  # Outputs 0
print(list(squares_gen))   # Outputs [1, 4, 9, 16] (note: 0 has already been consumed)

Compared to list comprehensions (which return all results immediately), generator expressions evaluate lazily and are more memory-efficient.
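
The memory difference can be observed roughly with sys.getsizeof. This is a sketch assuming CPython; the exact byte counts vary by version and platform, so only the comparison is shown.

```python
import sys

n = 100_000
squares_list = [x**2 for x in range(n)]  # builds all n results up front
squares_gen = (x**2 for x in range(n))   # builds nothing until iterated

# getsizeof reports only the container itself, not the integers it refers to.
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True

# A generator expression can be passed straight to a consuming function.
total = sum(x**2 for x in range(5))
print(total)  # 30
```

When a generator expression is the sole argument to a function call, as in sum() above, the extra pair of parentheses can be omitted.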

6. Differences Between Iterators and Generators

  • Iterator: The general protocol (__iter__ and __next__). Writing one as a class takes more code, but it suits objects that need complex state management or extra methods.
  • Generator: An iterator written as a function with yield (or as a generator expression), suitable for sequential data flows.
  • Commonality: Both work with for loops and the next() function, move forward only, and can be traversed just once.

Summary
Iterators and generators are core tools for lazy computation in Python. Iterators standardize traversal through a protocol, while generators offer a concise syntax for the same protocol and are the more common choice in practice. Their on-demand evaluation saves memory, especially when handling streaming data or infinite sequences.