Finn Bock wrote:
I consider the above PEP ready for review by the developers. Please comment.
The PEP seems to dictate that the source must, by default, be read as Latin-1:
""" Python will default to Latin-1 as standard encoding if no other encoding hints are given. """
Jython already reads the Python source with the default Java encoding, which usually depends on the PC's locale.
If a small loophole could be added to that requirement, then the PEP has my full support.
Hmm, in phase two we will need to decode the source code file into Unicode using some encoding and then re-encode the 8-bit string parts using that same encoding. The only requirement we have for that is round-trip safety, so that string literals turn out as the same value you see in the source file.
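To make the round-trip requirement concrete, here is a minimal sketch in present-day Python. The helper name `round_trips` and the sample source line are made up for illustration and are not part of the PEP; the check simply decodes the bytes and re-encodes them with the same codec.

```python
# Hypothetical source line containing one non-ASCII byte (0xE9) in a string literal.
source_bytes = b's = "caf\xe9"\n'


def round_trips(data: bytes, encoding: str) -> bool:
    """Return True if decode followed by encode restores the original bytes."""
    try:
        return data.decode(encoding).encode(encoding) == data
    except UnicodeError:
        # The codec cannot represent these bytes at all, so it is not usable.
        return False


# Latin-1 maps every byte 0x00-0xFF to one code point, so it always round-trips.
latin1_ok = round_trips(source_bytes, "latin-1")

# A lone 0xE9 byte is not valid UTF-8, so this particular file fails under UTF-8.
utf8_ok = round_trips(source_bytes, "utf-8")
```

Under these assumptions, any codec for which `round_trips` holds on the file would leave 8-bit string literals byte-for-byte unchanged.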
Now, Unicode literals are explicit about this: unicode-escape is a latin-1 codec with some escaping knowledge. I'm not sure how to get this in line with the "any round-trip safe encoding" strategy...
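The Latin-1 behaviour of unicode-escape can be seen in a short sketch. It uses the `unicode_escape` codec as it exists in current Python, which is an assumption relative to the interpreter discussed in this thread; the sample bytes are made up.

```python
# A raw 0xE9 byte followed by a textual \u20ac escape (backslash, 'u', four hex digits).
raw = b"caf\xe9 \\u20ac"

# unicode_escape treats non-escape bytes as Latin-1 and expands the \uXXXX escapes.
escaped = raw.decode("unicode_escape")

# Plain Latin-1 decodes the same bytes but leaves the escape sequence as literal text.
plain = raw.decode("latin-1")
```

Here `escaped` holds U+00E9 and U+20AC as real characters, while `plain` still contains the six-character escape text, which is why a Unicode literal is tied to Latin-1 for its non-escaped bytes.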
OTOH, if Jython users write source code which depends on the PC's locale then they are bound to write non-portable code, so fixing one encoding would certainly help here.
What I don't understand is why you read the file using the PC's locale. Wouldn't it be possible to set the file encoding prior to reading from it?
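As an illustration of fixing the encoding before reading, rather than inheriting it from the locale, here is a sketch in Python (Jython itself would do this on the Java side, which is not shown). The function name `read_source` and the temporary file are hypothetical.

```python
import os
import tempfile


def read_source(path: str, encoding: str = "latin-1") -> str:
    """Read a source file with an explicit encoding instead of the locale default."""
    # newline="" keeps line endings exactly as they appear in the file.
    with open(path, encoding=encoding, newline="") as f:
        return f.read()


# Write a small Latin-1 encoded file to a temporary location, then read it back.
fd, path = tempfile.mkstemp(suffix=".py")
try:
    with os.fdopen(fd, "wb") as f:
        f.write(b's = "caf\xe9"\n')
    text = read_source(path)
finally:
    os.remove(path)
```

With the encoding passed explicitly, the result no longer depends on the machine the file is read on.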
-- Marc-Andre Lemburg CEO eGenix.com Software GmbH