Iterators & Generators
Understand Python's iteration protocol, iterators, generators, yield, and itertools for memory-efficient data processing.
Iteration is central to Python. Every for loop uses the iterator protocol — and generators let you build lazy, memory-efficient sequences.
The Iteration Protocol
An object is iterable if it implements __iter__(), which returns an iterator. An iterator implements __next__() to produce the next value, plus __iter__(), which returns the iterator itself:
nums = [1, 2, 3]
it = iter(nums)
next(it) # 1
next(it) # 2
next(it) # 3
next(it) # raises StopIteration
for x in nums calls iter(nums) once, then calls next() on the resulting iterator repeatedly until StopIteration is raised.
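The loop mechanics described above can be written out by hand. This is a rough sketch of the equivalent steps, not the interpreter's actual implementation; the `seen` list is only there so the result can be inspected.

```python
nums = [1, 2, 3]

# Roughly what `for x in nums: seen.append(x)` does under the hood.
seen = []
it = iter(nums)            # ask the iterable for an iterator
while True:
    try:
        x = next(it)       # fetch the next item
    except StopIteration:
        break              # the iterator is exhausted, so the loop ends
    seen.append(x)         # the loop body

print(seen)
```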
Custom Iterator Class
class Countdown:
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

for n in Countdown(5):
    print(n) # 5, 4, 3, 2, 1
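Because Countdown returns itself from __iter__(), it can only be looped over once. The sketch below illustrates that, and shows one possible way around it: a separate iterable whose __iter__() builds a fresh iterator each time. `ReusableCountdown` is a hypothetical name introduced here for illustration, not something from the article.

```python
class Countdown:
    """Iterator from the section above: __iter__ returns self, so it is single-use."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


class ReusableCountdown:
    """Hypothetical iterable: every __iter__ call hands back a new Countdown."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return Countdown(self.start)


one_shot = Countdown(3)
first_pass = list(one_shot)       # consumes the iterator
second_pass = list(one_shot)      # nothing left to produce

reusable = ReusableCountdown(3)
reusable_first = list(reusable)   # fresh iterator
reusable_second = list(reusable)  # another fresh iterator
```

Built-in containers such as list follow the second design, which is why the same list can be looped over many times.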
Generators — Simple Iterators
Use yield instead of return to create a generator function. Calling it returns a generator object, which is an iterator:
def countdown(start):
    while start > 0:
        yield start
        start -= 1

for n in countdown(5):
    print(n)
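Generators are lazy: they produce values one at a time, on demand. The generator object stores only its own paused state, so its memory use stays roughly constant no matter how many values it yields.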
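To make the laziness visible, the sketch below adds a hypothetical `events` list to the countdown generator and records when its body actually runs. The point it tries to show is that calling the function runs none of the body, and each next() runs only as far as the following yield.

```python
events = []  # hypothetical trace list, added only for illustration

def countdown(start):
    events.append("body started")
    while start > 0:
        yield start
        events.append(f"resumed after {start}")
        start -= 1

gen = countdown(2)
before_first_next = list(events)   # nothing has run yet

first = next(gen)                  # runs up to the first yield
after_first_next = list(events)

second = next(gen)                 # resumes, then pauses at the second yield
after_second_next = list(events)
```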
Generator Expressions
Like list comprehensions but lazy:
# List comprehension — builds entire list in memory
squares_list = [x**2 for x in range(1_000_000)]
# Generator expression — yields one at a time
squares_gen = (x**2 for x in range(1_000_000))
sum(squares_gen) # consumes the generator
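Two consequences of this are easy to check: the generator object stays small however large the range is, and it can be consumed only once. The sketch below uses a smaller range than the example above so it runs quickly. The sizes reported by sys.getsizeof vary by Python version and cover only the container itself, so treat them as indicative.

```python
import sys

squares_list = [x**2 for x in range(100_000)]
squares_gen = (x**2 for x in range(100_000))

list_size = sys.getsizeof(squares_list)  # grows with the number of elements
gen_size = sys.getsizeof(squares_gen)    # small, independent of the range

first_total = sum(squares_gen)   # consumes the generator
second_total = sum(squares_gen)  # already exhausted, so the sum is 0

print(list_size, gen_size, first_total, second_total)
```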
Useful Generator Patterns
Reading Large Files
def read_lines(path):
    with open(path) as f:
        for line in f:
            yield line.strip()

for line in read_lines("huge_log.txt"):
    if "ERROR" in line:
        print(line)
Pipeline Processing
def read_numbers(path):
    with open(path) as f:
        for line in f:
            yield int(line.strip())

def filter_even(numbers):
    for n in numbers:
        if n % 2 == 0:
            yield n

def square(numbers):
    for n in numbers:
        yield n ** 2

pipeline = square(filter_even(read_numbers("nums.txt")))
for result in pipeline:
    print(result)
Infinite Sequences
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

fib = fibonacci()
first_10 = [next(fib) for _ in range(10)]
# [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
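An infinite generator must always be bounded by its consumer, since calling list() on it directly would never finish. As an alternative to calling next() in a comprehension, the sketch below bounds the same fibonacci generator with itertools.islice (by count) and itertools.takewhile (by condition). The cutoff of 100 is an arbitrary value chosen for illustration.

```python
import itertools

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Take a fixed number of items from the infinite stream.
first_10 = list(itertools.islice(fibonacci(), 10))

# Take items until the condition first fails (100 is an arbitrary cutoff).
below_100 = list(itertools.takewhile(lambda n: n < 100, fibonacci()))
```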
itertools Module
import itertools
# Infinite iterators
itertools.count(10, 2) # 10, 12, 14, ...
itertools.cycle("AB") # A, B, A, B, ...
itertools.repeat(7, 3) # 7, 7, 7 (finite here; repeat(7) with no count never stops)
# Combinatorics
itertools.combinations([1,2,3,4], 2) # pairs
itertools.permutations([1,2,3], 2) # ordered pairs
itertools.product([0,1], repeat=3) # binary combos
# Chaining and slicing
itertools.chain([1,2], [3,4]) # 1, 2, 3, 4
itertools.islice(range(100), 5, 10) # 5, 6, 7, 8, 9
# Group consecutive items
data = [("a", 1), ("a", 2), ("b", 3)]
for key, group in itertools.groupby(data, key=lambda x: x[0]):
    print(key, list(group))
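Since groupby only merges consecutive items with equal keys, unsorted input produces repeated keys. The sketch below contrasts the two cases, using a reordered variant of the data above and sorting by the same key before grouping. The variable names are illustrative.

```python
import itertools
from operator import itemgetter

first = itemgetter(0)  # key function: the first element of each pair
data = [("a", 1), ("b", 3), ("a", 2)]  # "a" items are not adjacent

# Without sorting, "a" shows up as two separate groups.
unsorted_keys = [key for key, _ in itertools.groupby(data, key=first)]

# Sorting by the same key first makes equal keys adjacent.
grouped = {
    key: [value for _, value in group]
    for key, group in itertools.groupby(sorted(data, key=first), key=first)
}
```

Each group is itself a lazy iterator tied to the underlying data, so it should be consumed (for example with list()) before advancing to the next key.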
Sending Values to Generators
def accumulator():
    total = 0
    while True:
        value = yield total
        if value is not None:
            total += value

acc = accumulator()
next(acc) # prime the generator → 0
acc.send(10) # 10
acc.send(20) # 30
acc.send(5) # 35
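The priming step matters because send() delivers its argument to a yield expression the generator is already paused at. The sketch below shows what seems to happen without priming (a TypeError for a non-None value), that send(None) is equivalent to next() for priming, and that close() ends the generator. The variable names are illustrative.

```python
def accumulator():
    total = 0
    while True:
        value = yield total
        if value is not None:
            total += value

unprimed = accumulator()
try:
    unprimed.send(10)          # not paused at a yield yet, so nothing can receive 10
    send_error = None
except TypeError as exc:
    send_error = exc

acc = accumulator()
primed_value = acc.send(None)  # same effect as next(acc): run to the first yield
running = [acc.send(n) for n in (10, 20, 5)]

acc.close()                    # raises GeneratorExit inside the paused generator
remaining = list(acc)          # a closed generator yields nothing further
```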
When to Use Generators
| Use generators when… | Use lists when… |
|---|---|
| Data doesn’t fit in memory | You need random access by index |
| Processing pipelines | You iterate multiple times |
| Infinite or unknown-length sequences | You need len() or slicing |
Generators are one of Python’s most powerful features for writing efficient, elegant data processing code.