How to change a generator ? - resolved

Barak, Ron Ron.Barak at
Thu Dec 25 08:27:23 CET 2008

Hi Gabriel,

Your remarks fixed my problem. Now my code looks as below, and behaves as expected.

Thanks Gabriel.

Merry Christmas and Happy Hanukkah,

$ cat
#!/usr/bin/env python

import gzip
from Debug import _line as line

class LogStream():

    def __init__(self, filename):
        self.filename = filename
        self.input_file = self.open_file(filename)

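    def open_file(self, in_file):
        # gzip only checks the file format on the first read, so read one
        # line to trigger the IOError for plain-text files, then rewind.
        try:
            f = gzip.GzipFile(in_file, "r")
            f.readline()
        except IOError:
            f = open(in_file, "r")
        f.seek(0)
        return f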

    def line_generator(self):
        print line()+". self.input_file.tell()==",self.input_file.tell()
        while True:
            line_ = self.input_file.readline()
            print line()+". self.input_file.tell()==",self.input_file.tell()
            if not line_:
                break
            yield line_.strip()

if __name__ == "__main__":

    filename = "sac.log.50lines"
    log_stream = LogStream(filename)
    line_generator = log_stream.line_generator()
    line_ = line_generator.next()

$ python
23. self.input_file.tell()== 0
26. self.input_file.tell()== 247

$ !wc
wc -c sac.log.50lines
6623 sac.log.50lines


-----Original Message-----
From: MRAB [mailto:google at]
Sent: Wednesday, December 24, 2008 20:00
To: python-list at
Subject: Re: How to change a generator ?

Gabriel Genellina wrote:
> On Wed, 24 Dec 2008 15:03:58 -0200, MRAB <google at>
> wrote:
>>> I have a generator whose aim is to return consecutive lines from a
>>> file (the listing below is a simplified version).
>>> However, as it is written now, the generator method changes the text
>>> file pointer to end of file after first invocation.
>>> Namely, the file pointer changes from 0 to 6623 on line 24.
>> It might be that the generator method of self.input_file is reading
>> the file a chunk at a time for efficiency even though it's yielding a
>> line at a time.
> I think this is the case too.
> I can think of 3 alternatives:
> a) open the file unbuffered (bufsize=0). But I think this would
> greatly decrease performance.
> b) keep track internally of file position (by adding each line length).
> The file should be opened in binary mode in this case (to avoid any '\n'
> translation).
> c) return line numbers only, instead of file positions. Seeking to a
> certain line number requires re-reading the whole file from the start;
> depending on how often this is required and how big the file is, this
> might be acceptable.
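
Alternative (b) above could look roughly like the sketch below. It is written for Python 3 rather than the Python 2 of this thread, and the function name lines_with_offsets, the UTF-8 decoding and the sample data are assumptions made for illustration only. The file is opened in binary mode so that each line's length is the exact number of bytes consumed.

```python
import os
import tempfile


def lines_with_offsets(path, start=0):
    """Yield (offset, line) pairs, tracking the offset by summing line lengths."""
    # Binary mode avoids any newline translation, so len(raw) is the exact
    # number of bytes consumed from the file for this line.
    with open(path, "rb") as f:
        f.seek(start)
        offset = start
        for raw in f:
            # Assumption: the log is UTF-8 text; adjust the codec as needed.
            yield offset, raw.decode("utf-8").rstrip("\r\n")
            offset += len(raw)


if __name__ == "__main__":
    # Hypothetical sample data written to a throwaway temp file.
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"first\nsecond\r\nthird\n")
    try:
        pairs = list(lines_with_offsets(path))
        print(pairs)
        # Resume reading from the recorded offset of the second line.
        print(list(lines_with_offsets(path, start=pairs[1][0])))
    finally:
        os.remove(path)
```

Because the offset is computed by the generator itself, it does not depend on what tell() reports while the file object is reading ahead.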
readline() appears to work as expected, leaving the file position at the start of the next line.
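
A quick way to check that behaviour is sketched below. It targets Python 3 and binary mode (not the Python 2 text-mode file of the listing above), and the helper name positions_after_readline and the sample bytes are hypothetical.

```python
import os
import tempfile


def positions_after_readline(path):
    """Return the position reported by tell() after each readline() call."""
    positions = []
    with open(path, "rb") as f:
        while True:
            raw = f.readline()
            if not raw:
                # An empty result means end of file.
                break
            positions.append(f.tell())
    return positions


if __name__ == "__main__":
    # Hypothetical sample data written to a throwaway temp file.
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"ab\ncde\n\nf")
    try:
        print(positions_after_readline(path))
    finally:
        os.remove(path)
```

Each reported position is the running total of the line lengths read so far, i.e. the start of the next line.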
