[Tutor] Class-based generator
Michael O'Leary
michael at seomoz.org
Mon Feb 18 08:36:25 CET 2013
Last week I wrote some code to create tasks to be run in a queue-based
system. It was one big monolithic function with two parts:
1) read data from a file and create dictionaries and lists to iterate
   through
2) iterate through the lists, creating a job data file and a task for the
   queue one at a time until all of the data has been dealt with
My boss reviewed my code and said that it would be more reusable and
Pythonic if I refactored it as a generator that creates the job data files,
with the calling code iterating over the generator and putting a task on
the queue for each job data file it yields.
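The split described above can be sketched roughly as follows. This is a minimal illustration, not the actual code: the name job_batches, the sample records, and the use of in-memory lists in place of job data files are all assumptions.

```python
import queue


def job_batches(records, job_size):
    """Yield successive lists of at most job_size records.

    Each list is a stand-in for one job data file.
    """
    for start in range(0, len(records), job_size):
        yield records[start:start + job_size]


# Hypothetical stand-ins for the real input data and the task queue.
records = list(range(45))
task_queue = queue.Queue()

# The caller drives the generator and enqueues one task per batch.
for batch in job_batches(records, 20):
    task_queue.put(batch)
```

The generator only knows how to produce batches; the decision about what to do with each one stays in the calling loop.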
This made sense to me, and since the code does a bunch of conversion of the
data in the input file(s) to make it easier and faster to iterate through
the data, I decided to create a class for the generator and put that
conversion code into its __init__ function. So the class looked like this:
    class JobFileGenerator:
        def __init__(self, filedata, output_file_prefix, job_size):
            <convert filedata to a more usable form>
        def next(self):
            while <there is more data>:
                <yield a job data file>
The problem is that the generator object is not created until you call
next(), so the calling code has to look like this:
    gen = JobFileGenerator(data, "output_", 20).next()
    for datafile in gen.next():
        <put a job that uses datafile into the queue>
This code works OK, but I don't like that it needs to call next() once to
get a generator and then call next() again repeatedly to get the data for
the jobs. If I were to write this without a class as a single generator
function, it would not have to do this, but it would have the monolithic
structure that my boss objected to.
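The behavior described above follows from how Python treats any function whose body contains yield: calling it does not run the body, it only returns a generator object. A minimal sketch, using a hypothetical Countdown class with the same shape as JobFileGenerator (the class name and the countdown logic are illustrative only):

```python
import types


class Countdown:
    """Hypothetical stand-in with the same shape as the class above."""

    def __init__(self, start):
        self.start = start

    def next(self):
        # This body contains `yield`, so calling next() does not execute it.
        # The call returns a generator object instead.
        n = self.start
        while n > 0:
            yield n
            n -= 1


gen = Countdown(3).next()
is_generator = isinstance(gen, types.GeneratorType)

# Iterating over the generator is what actually runs the method body.
values = list(gen)
```

So a method written this way is an ordinary method that happens to return a generator; it is not itself the per-item step of an iterator.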
Would it work to do this:
    for datafile in JobFileGenerator(data, "output_", 20).next():
        <put a job that uses datafile into the queue>
or would that cause the JobFileGenerator's __init__ function to be called
more than once? Are there examples I could look at of generator functions
defined on classes similar to this, or is it considered a bad idea to mix
the two paradigms?
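One possible alternative, offered as a sketch rather than a definitive answer, is to make __iter__ the generator method so that a for loop can iterate over the instance directly. The class name JobBatches, the init_calls counter, and the in-memory batches standing in for job data files are all assumptions made for illustration.

```python
class JobBatches:
    """Hypothetical sketch: conversion in __init__, generation in __iter__."""

    init_calls = 0  # counts constructor calls, for demonstration only

    def __init__(self, records, job_size):
        JobBatches.init_calls += 1
        # Stand-in for the real conversion of the input data.
        self.records = list(records)
        self.job_size = job_size

    def __iter__(self):
        # A generator method: each call returns a fresh generator, so the
        # same instance can be looped over more than once.
        for start in range(0, len(self.records), self.job_size):
            yield self.records[start:start + self.job_size]


# The expression after `in` is evaluated once, before the loop starts.
sizes = []
for batch in JobBatches(range(45), 20):
    sizes.append(len(batch))
```

After the loop, init_calls is 1, which suggests that constructing the object inside the for statement does not re-run __init__ on each iteration.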
Thanks,
Mike