[Tutor] Class-based generator

Michael O'Leary michael at seomoz.org
Mon Feb 18 08:36:25 CET 2013

Last week I wrote some code to create tasks to be run in a queue-based
system. It was one big monolithic function with two parts:
1) read data from a file and create dictionaries and lists to iterate over
2) iterate through the lists, creating a job data file and a task for the
queue one at a time until all of the data has been processed

My boss reviewed my code and said that it would be more reusable and
Pythonic if I refactored it as a generator that creates job data files,
with the calling code iterating over the generator and putting a task on
the queue for each job data file it obtains.

This made sense to me, and since the code does a bunch of conversion of the
data in the input file(s) to make it easier and faster to iterate through
the data, I decided to create a class for the generator and put that
conversion code into its __init__ function. So the class looked like this:

class JobFileGenerator:
    def __init__(self, filedata, output_file_prefix, job_size):
        <convert filedata to a more usable form>

    def next(self):
        while <there is more data>:
            <yield a job data file>

The problem is that the generator object is not created until you call
next(), so the calling code has to look like this:

gen = JobFileGenerator(data, "output_", 20).next()  # returns the generator
for datafile in gen:  # the loop calls the generator's next() implicitly
    <put a job that uses datafile into the queue>

This code works OK, but I don't like that it needs to call next() once to
get a generator and then call next() again repeatedly to get the data for
the jobs. If I were to write this without a class as a single generator
function, it would not have to do this, but it would have the monolithic
structure that my boss objected to.
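
A minimal sketch of why the extra call is needed. The list slicing, the
attribute names, and the sample values are stand-ins, not the real
conversion code. Any method whose body contains yield returns a generator
object when called, and none of its body runs until iteration starts:

```python
import inspect


class JobFileGenerator:
    def __init__(self, filedata, output_file_prefix, job_size):
        # Hypothetical stand-in for the real conversion step.
        self.records = list(filedata)
        self.output_file_prefix = output_file_prefix
        self.job_size = job_size

    def next(self):
        # The yield below makes this a generator function: calling it
        # returns a generator object instead of running this body.
        for start in range(0, len(self.records), self.job_size):
            yield self.records[start:start + self.job_size]


gen = JobFileGenerator(range(5), "output_", 2).next()
is_generator = inspect.isgenerator(gen)
chunks = list(gen)  # iterating the generator object yields each chunk
```

So the first next() call only builds the generator; the for loop (or list()
here) is what actually drives it.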

Would it work to do this:

for datafile in JobFileGenerator(data, "output_", 20).next():
    <put a job that uses datafile into the queue>

or would that cause the JobFileGenerator's __init__ function to be called
more than once? Are there examples I could look at of generator functions
defined on classes similar to this, or is it considered a bad idea to mix
the two paradigms?
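
One common way to combine the two is to write the generator as __iter__, so
that the instance itself can be used directly in a for loop. The sketch
below uses hypothetical attribute names, and list slicing stands in for
writing real job data files. The for statement evaluates its iterable
expression once, so __init__ runs a single time; the init_calls counter is
there only to show that.

```python
class JobFileGenerator:
    init_calls = 0  # illustration only: counts how often __init__ runs

    def __init__(self, filedata, output_file_prefix, job_size):
        JobFileGenerator.init_calls += 1
        # Hypothetical stand-in for the real conversion step.
        self.records = list(filedata)
        self.output_file_prefix = output_file_prefix
        self.job_size = job_size

    def __iter__(self):
        # A generator method: each iteration yields one (name, chunk) pair.
        starts = range(0, len(self.records), self.job_size)
        for index, start in enumerate(starts):
            name = "%s%d" % (self.output_file_prefix, index)
            yield name, self.records[start:start + self.job_size]


jobs = []
for datafile in JobFileGenerator(range(5), "output_", 2):
    jobs.append(datafile)  # stand-in for putting a task on the queue
```

With this layout the conversion work stays in __init__, the iteration logic
stays in __iter__, and the calling code needs no explicit next() call.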