Python2.2 + mailbox. Bug????
Heiko Wundram
heikowu at ceosg.de
Fri May 21 13:32:03 EDT 2004
On Friday, 21 May 2004 18:18, Noam Raphael wrote:
> I assume that when you write "./program.py < mboxfile.txt", Python knows
> that sys.stdin is a regular file, so it can do seek on it (for example,
> go back to its beginning).
It's not Python that knows that you can seek on the file. When you redirect
a file into a program, the shell calls filefd = open(file,"r");
dup2(filefd,0) (fd 0 = stdin) in the child process it forks, just before
starting the program. sys.stdin is always just connected to file descriptor 0
as it was passed in. Here the shell has made descriptor 0 refer to a regular
file, and regular files are seekable.
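
A minimal sketch of that, assuming a POSIX system. The temporary file and its
content are made up for illustration, and os.dup() stands in for the shell's
dup2(filefd,0) so that the interpreter's own stdin is left alone:

```python
import os
import tempfile

# Stand-in for mboxfile.txt (hypothetical content).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"From alice@example.com Fri May 21 18:18:00 2004\n\nhello\n")
    path = tmp.name

file_fd = os.open(path, os.O_RDONLY)  # what the shell does for "< mboxfile.txt"
stdin_like = os.dup(file_fd)          # the shell would use dup2(file_fd, 0)

os.read(stdin_like, 5)                           # consume b"From "
rewound = os.lseek(stdin_like, 0, os.SEEK_SET)   # seeking back succeeds
first_bytes = os.read(stdin_like, 5)             # same bytes again

os.close(stdin_like)
os.close(file_fd)
os.remove(path)
```

The duplicated descriptor refers to the same regular file, so lseek() works on
it exactly as it would on the original.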
> When you write "cat mboxfile.txt |
> program.py", the program cat outputs the file mboxfile.txt into
> sys.stdin byte by byte, so you can't do seek on it.
Now, when you pipe the output of one program into another, the shell creates
a pipe using readfd, writefd = pipe(). The write end is connected to the
stdout of the first program (dup2(writefd,1); fd 1 = stdout) when the shell
forks to start it, and the read end is connected to the stdin of the second
program (dup2(readfd,0); fd 0 = stdin, as before), again when the shell forks
to start it. This means that sys.stdin of the Python program, which again is
connected to file descriptor 0, now refers to a pipe rather than a regular
file. Pipes are not seekable, and that's exactly what the exception is
telling you.
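
A minimal sketch of the pipe case, assuming a POSIX system. No second program
is started here; the read end of a fresh pipe simply plays the role of stdin:

```python
import os

read_fd, write_fd = os.pipe()       # what the shell creates for "cat ... | program.py"
reader = os.fdopen(read_fd, "rb")   # file object on the read end, like sys.stdin

pipe_seekable = reader.seekable()   # the file object reports it cannot seek

try:
    reader.seek(0)
    seek_error = None
except OSError as exc:              # io.UnsupportedOperation is an OSError subclass
    seek_error = exc

reader.close()
os.close(write_fd)
```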
So, what do we learn from this? The mbox parser needs a seekable file to be
able to parse it (err, I guess it wouldn't strictly need this, but who knows,
look at the source, Luke!), so you need to pass it a file-like object which
implements seek (or at least one backed by a file descriptor which is
seekable, which pipes are not).
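
One possible workaround, sketched here under assumptions: copy whatever comes
in on the pipe into a temporary file first and hand that to the parser
instead. The helper name seekable_copy is hypothetical, and the demo feeds it
from a pipe rather than from the real sys.stdin:

```python
import os
import shutil
import tempfile

def seekable_copy(stream):
    """Copy a (possibly unseekable) binary stream into a temporary file."""
    spool = tempfile.TemporaryFile()
    shutil.copyfileobj(stream, spool)
    spool.seek(0)                    # rewind so the caller starts at the beginning
    return spool

# Demo: a pipe stands in for "cat mboxfile.txt | program.py".
read_fd, write_fd = os.pipe()
os.write(write_fd, b"From alice@example.com Fri May 21 18:18:00 2004\n\nhello\n")
os.close(write_fd)

with os.fdopen(read_fd, "rb") as piped:
    spool = seekable_copy(piped)

first = spool.readline()
spool.seek(0)                        # rewinding works on the copy
again = spool.readline()
spool.close()
```

In a real program the argument would be the binary stdin stream. The price is
that the whole input is read and stored before parsing can start.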
HTH!
Heiko.