Today we’ll be talking about generators.
A generator function is a function that produces a sequence of values, one at a time, using the yield statement.
When you call a generator function, none of its body runs yet; the call simply returns a generator object. The body starts executing when the next function is first called on the generator object, and it is suspended when a yield statement is reached. The value following yield is then handed back to the caller. The generator function does not run again until next is called on the generator object again. When this happens, it resumes at the point where it left off, in exactly the state it was in when it was suspended, so all local variables are preserved between calls. Execution continues until the next yield statement is reached, which again hands back the yielded value.
This execution model is what sets generator functions apart. Regular functions always start execution from the beginning, no matter where they left off in the previous call. Generator functions resume execution at the point where they left off.
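To make the idea of saved state a bit more concrete, here is a minimal sketch. The countdown generator and its variable names are made up for illustration and are not used elsewhere in this article. It only shows that a local variable keeps its value between two calls to next:

```python
def countdown(start):
    # `current` is a local variable; its value survives between next() calls.
    current = start
    while current > 0:
        yield current
        current -= 1


counter = countdown(3)   # the body has not started running yet
first = next(counter)    # runs up to the first yield
second = next(counter)   # resumes after the yield, decrements, yields again
print(first, second)
```

A regular function with the same body and return instead of yield would hand back 3 on every call, because it would start from the top each time.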
Regular Function vs Generator
To better visualize the difference between a regular function and a generator, let’s create a simple function and a simple generator that look similar. Then we’ll see how different they are.
Regular Functions
Here’s a regular function:
def animal_function():
    return "lion"
    return "seal"
    return "yak"
    return "bison"

print(animal_function())
print(animal_function())
print(animal_function())
print(animal_function())
So, the function just returns a string. Every time we call the function, we get the first string returned in the function definition, so this code prints lion four times. This is because when return is reached, the value after return is returned and execution stops. The rest of the code, i.e. the other return statements, will never be reached.
Generators
How about using a generator? To turn a regular function into a generator, we use yield statements instead of return statements. Here’s our generator:
>>> def animal_generator():
...     yield "lion"
...     yield "seal"
...     yield "yak"
...     yield "bison"
...
Calling this generator function returns a generator object, which is an iterator.
If you want to learn more about iterables and iterators, I have an article devoted to this topic, so feel free to read it.
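If you'd like to check for yourself that a generator object really is an iterator, here's a small sketch. It uses the Iterator abstract base class from the standard library's collections.abc module; the variable names animals, is_iterator and is_own_iterator are just illustrative:

```python
from collections.abc import Iterator


def animal_generator():
    yield "lion"
    yield "seal"
    yield "yak"
    yield "bison"


animals = animal_generator()

# A generator object provides both __iter__ and __next__,
# so it passes the Iterator check.
is_iterator = isinstance(animals, Iterator)

# Like every iterator, it returns itself when passed to iter().
is_own_iterator = iter(animals) is animals

print(is_iterator, is_own_iterator)
```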
And now let’s assign the generator object to a variable:
>>> animal = animal_generator()
The next Function
Now we can iterate over all the four elements of the animal iterator using the next function. When we call next for the first time, the generator function will execute until it reaches the first yield statement. Then it will yield “lion” and execution will stop:
>>> next(animal)
'lion'
When we call the next function on the iterator again, animal_generator will resume where it left off, that is, right after the first yield statement. It will now execute until it reaches the second yield statement. Then it will yield “seal” and stop again:
>>> next(animal)
'seal'
On the third call of the next function on the animal iterator, the generator will resume execution and continue until the third yield statement is reached, yielding “yak”:
>>> next(animal)
'yak'
The fourth call yields “bison”, the last value, and the generator stops again:
>>> next(animal)
'bison'
When we call next on the iterator again, the generator function will resume once more, but this time there are no more yield statements, so the function body finishes and a StopIteration exception is raised:
>>> next(animal)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
Once the iterator object is exhausted, it can’t be reset. But we can always create a new iterator the same way as before:
>>> animal = animal_generator() # let’s create a new iterator
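If you'd rather not deal with the StopIteration exception by hand, the built-in next function accepts an optional second argument that is returned instead when the iterator is exhausted. Here's a minimal sketch; the fallback string and the variable names are arbitrary choices made for illustration:

```python
def animal_generator():
    yield "lion"
    yield "seal"
    yield "yak"
    yield "bison"


animal = animal_generator()

# Consume all four values.
collected = [next(animal) for _ in range(4)]

# The iterator is now exhausted. With a default, next() returns the
# default instead of raising StopIteration.
after_exhaustion = next(animal, "no more animals")

# The exhausted iterator stays exhausted; only a new one starts over.
fresh = next(animal_generator())

print(collected, after_exhaustion, fresh)
```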
Powers of Two Generator
In the following example we define a generator function that produces an iterator over the powers of 2. powers_of_two is the generator function and pot is the generator object. Next, in the for loop, we step through the elements of the iterator using the next function:
def powers_of_two(exponent):
    min_exponent = 0
    while min_exponent <= exponent:
        yield 2 ** min_exponent
        min_exponent += 1

pot = powers_of_two(8)

for i in range(9):
    print(next(pot))
Here’s the output:
1
2
4
8
16
32
64
128
256
For better understanding let’s have a closer look at what is going on when you run the code above:
def powers_of_two(exponent):
    min_exponent = 0
    while min_exponent <= exponent:
        yield 2 ** min_exponent
        min_exponent += 1

# Calling the generator function does not run its body yet.
# It just returns an iterator (the generator object).
pot = powers_of_two(8)

for i in range(9):
    # The first time we call the next function on the
    # iterator, the body of powers_of_two starts executing
    # from the top, just like any other function.
    print(next(pot))
– The code in the body of the generator is executed until it reaches the first yield statement inside the while loop. When this happens, the yield statement yields the value of the expression 2 ** min_exponent, which is 2 ** 0 = 1. Now the execution stops, but the values of any local variables (in this case min_exponent and the parameter exponent) are saved.
– When the next function in the for loop is called again, the generator resumes execution at the line directly following the yield statement, so at min_exponent += 1. Then the loop condition is checked and, since min_exponent (now 1) is still less than or equal to exponent (8), the loop runs again. And again the yield statement is reached. This time the value of min_exponent is 1, so 2 ** 1 = 2 is yielded. The execution stops again.
– This goes on until the iterator is exhausted, which happens when the while loop ends and the function body finishes, or when a return statement is reached. In our example the body of the for loop runs 9 times, which is exactly the number of values the iterator produces. One more call to next would raise the StopIteration exception.
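Counting the iterations by hand with range(9) works, but it's easy to get the number wrong. As an alternative sketch, you can loop over the generator object directly: the for loop calls next behind the scenes and ends quietly when StopIteration is raised. The results list is only there for illustration:

```python
def powers_of_two(exponent):
    min_exponent = 0
    while min_exponent <= exponent:
        yield 2 ** min_exponent
        min_exponent += 1


results = []

# The for loop asks the generator for the next value on each pass
# and stops on its own when the generator is exhausted.
for power in powers_of_two(8):
    results.append(power)

print(results)
```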
Infinite Iterators
Generators can also produce infinite iterators. In such a case it’s important to make sure there’s a way to terminate the execution somewhere in the code. Here’s the previous example rewritten so that the generator creates an infinite iterator. In this case termination is simple: the for loop only runs a limited number of times, so we stop asking for values when the for loop is finished. Here’s the code:
def powers_of_two():
    min_exponent = 0
    while True:
        yield 2 ** min_exponent
        min_exponent += 1

pot = powers_of_two()

for i in range(8):
    print(next(pot))
Now, no matter how many iterations the for loop runs, we will never get the StopIteration exception, because the iterator never runs out of values.
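Two other common ways to take a finite slice of an infinite iterator are itertools.islice from the standard library and a plain break inside a for loop. Here's a short sketch of both; the limits (5 values, values below 1000) are arbitrary and chosen only for illustration:

```python
from itertools import islice


def powers_of_two():
    min_exponent = 0
    while True:
        yield 2 ** min_exponent
        min_exponent += 1


# islice stops after the requested number of items,
# so the infinite generator is never fully consumed.
first_five = list(islice(powers_of_two(), 5))

# Alternatively, leave the loop explicitly once a condition is met.
below_1000 = []
for power in powers_of_two():
    if power >= 1000:
        break
    below_1000.append(power)

print(first_five)
print(below_1000)
```

Note that calling list() directly on an infinite generator would never finish, which is why some stopping mechanism is needed.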
Raising the StopIteration Exception
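For a function to be a generator function, its body must contain at least one yield statement. But there can be a return statement as well. When a generator executes a return statement (or simply reaches the end of its body), it finishes, and the pending next call raises a StopIteration exception. There are two ways you might try to end a generator explicitly, but only the second one works in current versions of Python:
1) using raise StopIteration. This worked in older versions of Python, but since Python 3.7 (PEP 479) a StopIteration raised inside a generator body is converted into a RuntimeError, so it should not be used anymore:
>>> def some_generator():
...     yield 1
...     yield 2
...     raise StopIteration
...
>>> it = some_generator()
>>>
>>> next(it)
1
>>> next(it)
2
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 4, in some_generator
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: generator raised StopIteration
2) using a return statement, which is the correct way: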
>>> def some_generator():
...     yield 1
...     yield 2
...     return
...
>>> it = some_generator()
>>>
>>> next(it)
1
>>> next(it)
2
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
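A return statement in a generator can also carry a value. In that case the value is attached to the StopIteration exception and can be read from its value attribute. Here's a minimal sketch; the returned string "done" and the variable names are made up for illustration:

```python
def some_generator():
    yield 1
    yield 2
    return "done"  # ends the generator; "done" travels on StopIteration


it = some_generator()
values = [next(it), next(it)]

try:
    next(it)
except StopIteration as stop:
    # The value given to return is stored on the exception object.
    return_value = stop.value

print(values, return_value)
```

A for loop swallows the StopIteration exception, so the returned value is not visible there; you only see it when you catch the exception yourself as above.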