In this video, Josh McQuiston gives a high-level description of what Python generators are. Learn how to use this powerful, built-in Python tool to easily and intuitively create iterator objects. He explains how iterators allow memory-efficient iteration over data. Generators are useful for iterating over datasets without storing the whole dataset in memory.
- Python generators are a great tool that makes creating iterator objects as easy as writing a simple function. When dealing with large data sets or memory-intensive operations, the tools you choose can make the difference between needlessly hogging memory and efficiently iterating over the data in a lazy, item-by-item fashion. To understand what iterators are, imagine going to the Department of Motor Vehicles and grabbing a ticket from the ticket machine.
The machine could just print out a stack of all the numbers it will ever need, but since it doesn't know how many it will need, it makes more sense to print out just one at a time. Of course, it has to remember which number it printed out last so that it can stay in order. It wouldn't do any good to print the same number out every time. If you wanted a current timestamp on the ticket, then it is essential that the machine waits until you press the button before it prints. This illustrates a few important concepts of iterators.
An iterator maintains state about which number it's on so that it can produce numbers in the proper sequence. It doesn't know how many it will produce. It just knows the next number it needs to produce. It doesn't evaluate what time it is until it is triggered to do so, which is what we call lazy evaluation. There is no need to store a large number of tickets, so it is space efficient. While regular containers like lists and tuples hold all of their data in memory, iterators are objects that support a method called __next__(), normally invoked through the built-in next() function, which grabs items one at a time.
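As a minimal sketch of the ticket machine idea, the snippet below gets an iterator from an ordinary list with the built-in iter() and presses the "button" with next(). The ticket numbers and variable names are made up for illustration.

```python
# Hypothetical data: three ticket numbers stored in a list (a regular container).
tickets = [101, 102, 103]

# iter() asks the list for an iterator; the iterator tracks its own position.
machine = iter(tickets)

first = next(machine)   # pressing the button once
second = next(machine)  # the iterator remembers where it left off
third = next(machine)

# Once the items run out, next() raises StopIteration.
try:
    next(machine)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, third, exhausted)
```

Each call to next() hands back exactly one item, and the iterator itself keeps track of where it is in the sequence.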
Calling next() is analogous to pressing the button on the ticket machine at the DMV. Iterables such as lists and tuples have an __iter__() method, normally invoked through the built-in iter() function, that returns an iterator. When you use a for loop or list comprehension in Python, behind the scenes the interpreter grabs the iterator and calls next() on it repeatedly in order to iterate over it. In a case where you have very large data sets, files that you want to process, or an infinite stream of data, it makes sense to use lazy evaluation and only evaluate one item at a time, because storing everything in memory is inefficient and often impossible.
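To make the "behind the scenes" part concrete, here is a rough sketch of what a for loop does, written out by hand. The function name manual_for_loop is hypothetical, and the real interpreter does this internally rather than through Python code like this.

```python
def manual_for_loop(iterable):
    """Collect items roughly the way a for loop would, using iter() and next()."""
    collected = []
    iterator = iter(iterable)      # step 1: grab the iterator
    while True:
        try:
            item = next(iterator)  # step 2: ask for one item at a time
        except StopIteration:      # step 3: stop when the iterator is exhausted
            break
        collected.append(item)
    return collected


numbers = (10, 20, 30)
via_for = [n for n in numbers]         # ordinary list comprehension
via_manual = manual_for_loop(numbers)  # the same iteration, spelled out
print(via_for == via_manual)
```

The two approaches visit the same items in the same order; the for loop simply hides the iter(), next(), and StopIteration handling.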
You can design your own class that implements the required machinery to be an iterator. Python generator functions make that process much simpler. Throughout this course, you will hear me use the terms generator function and generator object, or sometimes just generator. A generator function is what you actually program. Calling that function creates the generator object. The term generator by itself refers to the object.
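As a side-by-side sketch, assuming a ticket machine that hands out a fixed count of numbers, here is the hand-written iterator class next to an equivalent generator function. TicketIterator and ticket_numbers are illustrative names, not anything from a library.

```python
class TicketIterator:
    """Hand-written iterator: we manage the state and the protocol ourselves."""

    def __init__(self, start, count):
        self.current = start
        self.stop = start + count

    def __iter__(self):
        return self  # an iterator returns itself from __iter__

    def __next__(self):
        if self.current >= self.stop:
            raise StopIteration
        number = self.current
        self.current += 1
        return number


def ticket_numbers(start, count):
    """Generator function: Python supplies __iter__ and __next__ for us."""
    current = start
    for _ in range(count):
        yield current  # pause here until the next item is requested
        current += 1


from_class = list(TicketIterator(1, 3))

gen_object = ticket_numbers(1, 3)  # calling the function creates a generator object
from_generator = list(gen_object)

print(from_class, from_generator)
```

Note that calling ticket_numbers(1, 3) does not run the function body right away. It returns the generator object, and the body only runs as items are requested.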
A generator object is an iterator, but not all iterators are generator objects. I hope that gives you a picture in your mind of what generators are and why Python has them so that we can further build on that and learn more about creating and using them.
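One way to check the relationship between generators and iterators is with isinstance, using Iterator from collections.abc and GeneratorType from types. This is just a quick sketch, and count_up is a hypothetical generator function written for the example.

```python
from collections.abc import Iterator
from types import GeneratorType


def count_up(limit):
    """Hypothetical generator function yielding 0, 1, ..., limit - 1."""
    for n in range(limit):
        yield n


gen = count_up(3)            # a generator object
list_iter = iter([0, 1, 2])  # an iterator that is not a generator

gen_is_iterator = isinstance(gen, Iterator)
gen_is_generator = isinstance(gen, GeneratorType)
list_iter_is_iterator = isinstance(list_iter, Iterator)
list_iter_is_generator = isinstance(list_iter, GeneratorType)

print(gen_is_iterator, gen_is_generator)
print(list_iter_is_iterator, list_iter_is_generator)
```

The generator object passes both checks, while the list iterator is an iterator but not a generator.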
Video: Generators overview