[Python-Dev] Quick sum up about open() + BOM
Victor Stinner
victor.stinner at haypocalc.com
Sat Jan 9 13:37:06 CET 2010
On Saturday 09 January 2010 02:23:07, Martin v. Löwis wrote:
> While I would support combining BOM detection in the case where a file
> is opened for reading and no encoding is specified, I see two problems:
> a) if a seek operation is performed before having looked at the BOM,
> no determination would have been made
TextIOWrapper doesn't support seeking to an arbitrary byte offset. It uses a
"cookie", which is an opaque value. Reusing a cookie from another file, or a
stale cookie, is forbidden (although it doesn't raise an error). This is not
specific to BOM checking: the problem already exists for encodings that use a
BOM (e.g. UTF-16).
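A minimal sketch of the cookie behaviour, assuming an in-memory UTF-16 stream
(the sample text and variable names are illustrative only). The value returned
by tell() is only meant to be passed back to seek() on the same stream:

```python
import io

# Illustrative data: encoding with "utf-16" writes a BOM at the start.
raw = io.BytesIO("héllo\nworld\n".encode("utf-16"))
text = io.TextIOWrapper(raw, encoding="utf-16", newline="")

first = text.readline()   # the decoder consumes the BOM here
cookie = text.tell()      # opaque value, not a plain character index
rest = text.readline()

# Seeking to a cookie obtained from this same stream is supported.
text.seek(cookie)
again = text.readline()
```

Passing an arbitrary integer to seek() instead of a value obtained from tell()
is the unsupported case described above.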
> b) what encoding should it use on writing?
Don't change anything about writing.
With Antoine's choice: open('file.txt', 'w', encoding=None) continues to use
the current heuristic (os.device_encoding() or system locale).
With Guido's choice, encoding="BOM": it raises an error, because a BOM check
is not supported when writing to a file. How could the BOM be checked when
creating a new (empty) file!?
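A rough sketch of what a read-only BOM check could look like. The helper names
detect_bom_encoding() and open_with_bom() are hypothetical and not part of any
proposal in this thread; falling back to locale.getpreferredencoding() is an
assumption standing in for the existing heuristic:

```python
import codecs
import locale

# UTF-32 must be tested before UTF-16: BOM_UTF32_LE starts with BOM_UTF16_LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def detect_bom_encoding(head):
    """Return a codec name if `head` starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return None


def open_with_bom(path, mode="r"):
    """Hypothetical wrapper: check the BOM when reading, refuse other modes."""
    if mode != "r":
        # A new (empty) file has no BOM to look at.
        raise ValueError("BOM check is only supported when reading")
    with open(path, "rb") as f:
        head = f.read(4)  # the longest BOM (UTF-32) is 4 bytes
    encoding = detect_bom_encoding(head) or locale.getpreferredencoding(False)
    return open(path, mode, encoding=encoding)
```

The returned codec names ("utf-8-sig", "utf-16", "utf-32") are the ones that
consume the BOM while decoding, so it does not show up in the text that is read.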
--
Victor Stinner
http://www.haypocalc.com/