Gustavo Niemeyer wrote:
Given that files have an 'encoding' attribute, and that any unicode string containing characters outside the 0-127 range raises an exception when written to a file, isn't it reasonable to respect the 'encoding' attribute whenever writing data to a file?
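The failure being described can be sketched with the modern io module, using a TextIOWrapper over a byte buffer as a stand-in for a file whose encoding is plain ASCII (the names raw and stream are illustrative):

```python
import io

# Stand-in for a file object whose encoding is plain ASCII.
raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding='ascii')

stream.write(u'ascii is fine')   # characters in the 0-127 range pass
try:
    stream.write(u'caf\u00e9')   # u'\u00e9' is outside the 0-127 range
except UnicodeEncodeError as exc:
    print('write raised:', exc.reason)
```

The wrapper encodes at write() time, so the error surfaces exactly where the unicode object meets the stream.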
In general, files don't have an encoding parameter - sys.stdout is an exception.
That's the only case I'd like to solve.
If there are platforms that don't know how to set it, we could make the encoding attribute writable, and that would allow people to easily set it to the encoding which is deemed correct in their systems.
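One way people can already pin the encoding "deemed correct" on their system is to wrap the stream in a codecs writer; a minimal sketch, assuming UTF-8 is the desired encoding and using a byte buffer in place of the real output stream:

```python
import codecs
import io

raw = io.BytesIO()                 # stands in for the underlying byte stream
writer = codecs.getwriter('utf-8')(raw)

writer.write(u'caf\u00e9\n')       # encoded with the wrapper's encoding
print(raw.getvalue())              # b'caf\xc3\xa9\n'
```

The same trick works on sys.stdout itself, which is why a writable encoding attribute would mostly be a convenience for this existing idiom.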
The reason why this works for print and not for write is that I considered "print unicodeobject" important, and wanted to implement that. file.write is an entirely different code path, so it doesn't currently consider Unicode objects; instead, it only supports strings (or, more generally, buffers).
I understand your reasoning, and would like to extend the idea to the write method, allowing anyone to use the common sys.stdout idiom to implement print-like functionality (as optparse and many others do). For normal files, the absence of the encoding attribute would preserve the current behavior.
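What such an extended write could look like can be sketched with a hypothetical wrapper (EncodingFile is invented here for illustration, not code from any actual patch): unicode text is encoded when the stream has an encoding, while plain byte strings pass through unchanged, preserving today's behavior.

```python
import io

class EncodingFile:
    """Hypothetical wrapper: write() honours an 'encoding' attribute."""

    def __init__(self, raw, encoding=None):
        self.raw = raw
        self.encoding = encoding

    def write(self, data):
        # Unicode text is encoded only when an encoding is set;
        # byte strings keep the current behavior and pass through.
        if isinstance(data, str) and self.encoding is not None:
            data = data.encode(self.encoding)
        self.raw.write(data)

raw = io.BytesIO()
out = EncodingFile(raw, encoding='utf-8')
out.write(u'caf\u00e9 ')            # encoded as UTF-8
out.write(b'bytes pass through')    # unchanged
print(raw.getvalue())
```

With encoding=None the wrapper behaves exactly like a normal file, which matches the point about normal files above.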
This difference may become a really annoying problem when internationalizing programs, since it is common for third-party code to write to sys.stdout directly instead of using 'print'.
Apparently, it isn't important enough that somebody has analysed this and offered a patch.
That's what I'm doing here! :-)
In any case, it would be quite unreliable to pass unicode strings to .write even *if* .write supported .encoding, since most files don't have an .encoding. Even sys.stdout does not always have one: only when it is a terminal, and only if we managed to find out what the encoding of that terminal is.
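The defensive idiom this implies is to never assume .encoding exists and to fall back when it is missing; a small sketch (the helper name is illustrative):

```python
import io
import sys

def stream_encoding(stream, default=None):
    # Most file objects have no .encoding at all, and even sys.stdout
    # only gains one when it is a terminal whose encoding was detected.
    return getattr(stream, 'encoding', default)

print(stream_encoding(io.BytesIO()))   # None: plain files have no encoding
print(stream_encoding(sys.stdout))     # whatever Python detected, if anything
```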
I think that's acceptable. The encoding parameter is meant for output streams, and Python does its best to try to find a reasonable value for showing output strings.
Thanks for your answer and clarifications,