
For some time now, I've wanted to suggest a better abstraction for the <file> type in Python. It currently uses an antiquated, low-level, C-style interface for moving around in a file, with methods like tell() and seek(). Now that Python has attributes, that interface seems due for re-evaluation. Let the file type have an attribute .pos for position. You can then get rid of the seek() and tell() methods and manipulate the file pointer with the more standard and familiar arithmetic operations:
file.pos = 0x0ae1 #move file pointer to an absolute address
file.pos += 1 #increment the file pointer one byte
This simplifies the API by removing two obscure legacy methods (where one has to learn the additional concept of "absolute" and "relative" addressing) and replacing them with a single, more basic concept called "position". Thoughts? markj
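To make the proposal concrete, here is a minimal sketch of how a .pos attribute could be layered on top of the existing methods. PosFile is a hypothetical wrapper written for illustration only; it is not part of Python and not necessarily what the proposer had in mind.

```python
import io


class PosFile:
    """Hypothetical wrapper (an assumption, not a real API) exposing .pos."""

    def __init__(self, raw):
        self._raw = raw

    @property
    def pos(self):
        # Reading the attribute is just tell().
        return self._raw.tell()

    @pos.setter
    def pos(self, offset):
        # Assigning the attribute is an absolute seek().
        self._raw.seek(offset, io.SEEK_SET)

    def read(self, size=-1):
        return self._raw.read(size)


f = PosFile(io.BytesIO(b"abcdefgh"))
f.pos = 2            # absolute move
first = f.read(1)    # reads the byte at offset 2
f.pos += 1           # relative move: tell() followed by seek()
second = f.read(1)   # reads the byte at offset 4
```

Note that `f.pos += 1` expands to one tell() and one seek(), so the attribute form hides two calls rather than removing them.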

On 24 September 2012 18:49, Mark Adam <dreamingforward@gmail.com> wrote:
-1 This is not so distant from what can be achieved trivially by tell and seek. Moreover, even though changes in attributes _can_ be made to have side effects in Python objects, it does not mean it is easier to read and maintain in every case. What I think we need is a better way of dealing with constants - the "whence" argument for "seek" takes raw ints for "from start", "from end" and "relative" - but that is another subject entirely. js -><-
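On the point about raw ints for "whence": the standard library does provide named constants (os.SEEK_SET, os.SEEK_CUR, os.SEEK_END, with the same values also available in the io module), although they are plain integers. A small sketch using an in-memory buffer as a stand-in for a real file:

```python
import io
import os

buf = io.BytesIO(b"0123456789")

buf.seek(3, os.SEEK_SET)    # from start
a = buf.read(1)

buf.seek(2, os.SEEK_CUR)    # relative to the current position (now 4)
b = buf.read(1)

buf.seek(-1, os.SEEK_END)   # from end
c = buf.read(1)

# The names are only aliases for the raw ints 0, 1 and 2.
same_values = (io.SEEK_SET, io.SEEK_CUR, io.SEEK_END) == (0, 1, 2)
```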

On 9/24/12, Mark Adam <dreamingforward@gmail.com> wrote:
I agree, but I'm not sure the improvement can be *enough* of an improvement to justify the cost of change.
file.pos = 0x0ae1 #move file pointer to an absolute address
file.pos += 1 #increment the file pointer one byte
For text files, I would expect it to be a character count rather than a byte count. So this particular proposal might end up adding as much confusion as it hopes to remove. -jJ
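The byte-versus-character concern can be seen with any multi-byte encoding. The sketch below assumes UTF-8 and uses in-memory streams; it deliberately does not inspect the text stream's tell() value, which is not a character count.

```python
import io

data = "\u00e9!".encode("utf-8")   # "é!" is 2 characters but 3 bytes

# Binary stream: positions count bytes, so one step lands mid-character.
binary = io.BytesIO(data)
first_byte = binary.read(1)
binary_pos = binary.tell()

# Text stream: read(1) consumes one character, i.e. two bytes here.
text = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8")
first_char = text.read(1)
```

A hypothetical `file.pos += 1` would therefore need to say which of these two units it advances by.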

On Thu, Sep 27, 2012 at 4:00 PM, Guido van Rossum <guido@python.org> wrote:
Also you can't express lseek()'s "relative to end of file" mode using the proposed API. -1 on the whole thing.
You could use negative indexes, which is consistent with subscript and slice interfaces. I still don't know that this is a good idea, but I'm just saying. If someone wants a more sequence-like interface to files, they should use mmap.
-- Read my blog! I depend on your acceptance of my opinion! I am interesting! http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy
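As a rough illustration of the mmap suggestion: an mmap object already supports slicing, including negative indexes. The sketch uses an anonymous mapping (fileno -1) so it needs no file on disk; mapping a real file would pass its file descriptor instead.

```python
import mmap

# Anonymous 10-byte mapping, standing in for a mapped file.
m = mmap.mmap(-1, 10)
m[:] = b"0123456789"

middle = m[2:5]   # ordinary slice
tail = m[-3:]     # "relative to end" expressed as a negative index

m.close()
```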

On 28/09/12 06:00, Guido van Rossum wrote:
Also you can't express lseek()'s "relative to end of file" mode using the proposed API. -1 on the whole thing.
For what it's worth, there was extensive discussion on comp.lang.python that eventually decided that while you could express all the various invocations of seek using file.pos, at best you save two characters of typing and the whole thing isn't worth the change. http://mail.python.org/pipermail/python-list/2012-September/thread.html#6315... Personally, I think the proposal has died a natural death, but if anyone wants to resuscitate it, I encourage them to read the above thread before doing so. -- Steven

On 2012-09-27 20:40, Jim Jewett wrote:
In the talk about how to seek to the end of the file with file.pos, it was suggested that negative positions and None could be used. I wonder whether they could be used with seek. For example:
file.seek(-10) # Seek 10 bytes from the end.
file.seek(None) # Seek to the end.
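The suggested behaviour can be approximated today with a small helper. seek_signed below is a hypothetical function written for this sketch; the real seek() does not accept None, and it rejects negative offsets measured from the start.

```python
import io


def seek_signed(f, offset):
    """Hypothetical helper: None means end, negative means 'from the end'."""
    if offset is None:
        return f.seek(0, io.SEEK_END)
    if offset < 0:
        return f.seek(offset, io.SEEK_END)
    return f.seek(offset, io.SEEK_SET)


buf = io.BytesIO(bytes(range(20)))
from_end = seek_signed(buf, -10)   # 10 bytes before the end
at_end = seek_signed(buf, None)    # the end itself
at_start = seek_signed(buf, 0)     # absolute offset 0
```

None is needed because there is no "negative zero" to stand for the end of the file, which is the same gap that slice notation fills with an omitted bound.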

2012/9/28 MRAB <python@mrabarnett.plus.com>:
See the documentation: http://docs.python.org/library/io.html#io.TextIOBase.seek With text streams, SEEK_CUR and SEEK_END only accept offset=0 (i.e. no move, or go to EOF) and SEEK_SET accepts a "cookie" which was returned by a previous tell(). This cookie will often look like the absolute file position, but it also has to contain the codec status, which will be nontrivial for variable-length encodings. -- Amaury Forgeot d'Arc
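These restrictions can be checked with a text wrapper around an in-memory buffer. The sketch assumes CPython's io.TextIOWrapper, treats the tell() result as an opaque cookie, and only ever passes it back to seek().

```python
import io

raw = io.BytesIO("h\u00e9llo".encode("utf-8"))
text = io.TextIOWrapper(raw, encoding="utf-8")

text.read(2)
cookie = text.tell()        # opaque value; do not do arithmetic on it
rest = text.read()

text.seek(cookie)           # SEEK_SET with a cookie from tell() is allowed
again = text.read()

try:
    text.seek(1, io.SEEK_CUR)   # nonzero relative seek on a text stream
    relative_ok = True
except io.UnsupportedOperation:
    relative_ok = False

text.seek(0, io.SEEK_END)   # offset 0 from the end is allowed
at_eof = text.read()
```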

participants (11)
- Amaury Forgeot d'Arc
- Calvin Spealman
- Devin Jeanpierre
- Greg Ewing
- Guido van Rossum
- Jim Jewett
- Joao S. O. Bueno
- Mark Adam
- MRAB
- Philip Jenvey
- Steven D'Aprano