file.seek() and file.tell() look inconsistent to me
MRAB
python at mrabarnett.plus.com
Mon Jul 4 13:29:17 EDT 2016
On 2016-07-04 16:48, Marco Buttu wrote:
> Hi all,
>
> if I open a file in text mode, do you know why file.seek() takes the
> number of bytes, and file.tell() returns the number of bytes? I was
> expecting the number of characters, like write() does:
>
> >>> f = open('myfile', 'w')
> >>> f.write('aè')
> 2
>
> It seems to me not consistent, and maybe could also be error prone:
>
> >>> f.seek(2)
> 2
> >>> f.write('c')
> 1
> >>> f.close()
> >>> open('myfile').read()
> ...
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc3...
>
>
Some encodings, such as UTF-8, use a variable number of bytes per
character (codepoint, actually), so in order to seek to a certain
character position you would need to read from a known position, e.g.
the start of the file, until you reached the desired place.
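
A rough sketch of what "read from a known position" looks like in practice. The helper name seek_to_char and the temporary file are only for illustration; they are not part of the io module. The helper rewinds, decodes n characters, and then re-seeks to the value tell() reports so that a following write lands on a character boundary:

```python
import os
import tempfile

def seek_to_char(f, n):
    """Hypothetical helper: position text file f just after its first n characters."""
    f.seek(0)                   # known position: start of the file
    f.read(n)                   # decode n characters (fewer if EOF comes first)
    return f.seek(f.tell())     # re-seek to the reported position before any write

# Temporary file so the sketch does not touch anything real.
path = os.path.join(tempfile.mkdtemp(), 'myfile')
with open(path, 'w', encoding='utf-8') as f:
    f.write('aèbc')             # 'è' takes two bytes in UTF-8

with open(path, 'r+', encoding='utf-8') as f:
    seek_to_char(f, 2)          # after 'a' and 'è', i.e. a valid boundary
    f.write('X')                # overwrites 'b', not half of 'è'

with open(path, encoding='utf-8') as f:
    result = f.read()
print(result)
```

The cost is proportional to n, since everything before the target has to be decoded, which is why seek() itself doesn't do this for you.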
Most of the time you're seeking to a position that was previously
returned by tell() anyway.
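
For instance, a minimal sketch of that usual pattern, assuming a UTF-8 file (the path and the file contents are made up for the example). The value from tell() is treated as an opaque bookmark and handed straight back to seek(), with no arithmetic done on it:

```python
import os
import tempfile

# Temporary file with made-up contents, just for the example.
path = os.path.join(tempfile.mkdtemp(), 'myfile')
with open(path, 'w', encoding='utf-8') as f:
    f.write('aè\nbc\n')

with open(path, encoding='utf-8') as f:
    f.readline()            # consume the first line, which contains 'è'
    mark = f.tell()         # remember where the second line starts
    first = f.readline()
    f.seek(mark)            # go back to the remembered position
    again = f.readline()    # reads the same line a second time

print(repr(first), repr(again))
```

Used this way, it doesn't matter whether the number counts bytes or characters, as long as it only ever comes from tell().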
I think it's a case of "practicality beats purity".