PEP-0263 and default encoding
Martin v. Löwis
martin at v.loewis.de
Thu Oct 2 14:31:49 EDT 2003
bokr at oz.net (Bengt Richter) writes:
> How about letting someone in Klaus' situation be explicit in another
> way? E.g.,
> python -e iso-8859-1 the_unmarked_source.py
What would the exact meaning of this command line option be?
> Hm, I guess to be consistent you would have to have some way to pass
> -e info into any text-file-opening context, e.g., import, execfile,
> file, open, etc.
Ah, so it should probably apply only to the file passed to Python on
the command line - some people might think it would apply to all
files, though.
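
As a rough sketch of that narrower reading, the option can be emulated with a small launcher that decodes only the script it is given. The name run_with_encoding and the demo file are hypothetical; no such -e option exists in Python, and modules imported by the script would still be decoded by the normal rules.

```python
import os
import tempfile


def run_with_encoding(path, encoding):
    """Run one script, decoding its bytes with an explicitly given encoding.

    Hypothetical stand-in for "python -e ENCODING script.py"; it affects
    only this file, not anything the script imports.
    """
    with open(path, "rb") as f:
        raw = f.read()
    text = raw.decode(encoding)          # explicit, caller-chosen decoding
    code = compile(text, path, "exec")
    namespace = {"__name__": "__main__", "__file__": path}
    exec(code, namespace)
    return namespace


# Demo: an unmarked source file saved as ISO-8859-1.
fd, demo_path = tempfile.mkstemp(suffix=".py")
with os.fdopen(fd, "wb") as f:
    f.write(b"greeting = 'Gr\xfc\xdfe'\n")
try:
    result = run_with_encoding(demo_path, "iso-8859-1")
finally:
    os.remove(demo_path)
print(ascii(result["greeting"]))
```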
> In such case, you'd want a default. Maybe it could come from site.py
> with override by python -e, again with override by actual individual
> file-embedded encoding info.
This shows the problem of this approach: Now it becomes hidden in
site.py, and, as soon as you move the code to a different machine, the
problems come back.
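
A minimal sketch of the proposed precedence, and of why the hidden default is fragile. The function resolve_encoding and its parameter names are made up for illustration, and the ASCII fallback is an assumption; nothing here is an existing Python API.

```python
def resolve_encoding(file_declared=None, command_line=None, site_default=None):
    """Pick an encoding: file declaration, then -e, then the site.py default.

    Hypothetical helper; the final ASCII fallback is an assumption.
    """
    for candidate in (file_declared, command_line, site_default):
        if candidate is not None:
            return candidate
    return "ascii"


raw = b"\xe4"  # one non-ASCII byte from an unmarked source file

# Same file, two machines whose site.py defaults differ.
on_machine_a = raw.decode(resolve_encoding(site_default="iso-8859-1"))
on_machine_b = raw.decode(resolve_encoding(site_default="cp1251"))
print(on_machine_a == on_machine_b)

# With a declaration inside the file, the machine no longer matters.
declared = raw.decode(
    resolve_encoding(file_declared="iso-8859-1", site_default="cp1251")
)
```

Only the file-embedded declaration travels with the code, which is the point made above.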
> >1. You have a problem with existing code, and you are annoyed
> > by the warning. Just silence the warning in site.py.
> That's not the same as giving a proper encoding interpretation, is it?
> (Though in comments it wouldn't matter much).
No. However, it would restore the meaning that the code has in 2.2:
For comments and byte string literals, it would be the "as-is"
encoding; for Unicode literals, the interpretation would be latin-1.
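
To make the difference concrete, here is a small simulation of those two interpretations, written for a current Python rather than run on 2.2. The variable names are illustrative, and the example assumes the source file was saved as UTF-8.

```python
# Bytes that an editor saving in UTF-8 writes for the single character U+00E4.
source_bytes = "\u00e4".encode("utf-8")

# Byte string literal: kept "as-is", so it holds exactly the file's bytes.
byte_literal = source_bytes

# Unicode literal: every byte is read as one latin-1 character.
unicode_literal = source_bytes.decode("latin-1")

print(len(byte_literal), len(unicode_literal))
print(ascii(unicode_literal))
```

If the file had been saved as latin-1 instead, the single byte 0xE4 would decode to the intended character, which is why the old behaviour looked correct to many users.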
> >2. You are writing new code, and you are annoyed by the encoding
> > declaration. Just save your code as UTF-8, using the UTF-8 BOM.
> YMMV with the editor you are using though, right?
Somewhat, yes. However, I expect that most editors which
specifically support Python will also support PEP-263 sooner or later.
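
For checking what an editor actually wrote, the following sketch uses tokenize.detect_encoding from the standard library of current Python versions (it did not exist when this was written). The sample sources are invented for the example.

```python
import codecs
import io
import tokenize


def detected_encoding(source_bytes):
    """Return the encoding that Python's tokenizer detects for these bytes."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


body = "name = 'L\u00f6wis'\n"

# Saved as UTF-8 with a BOM and no declaration.
with_bom = codecs.BOM_UTF8 + body.encode("utf-8")

# Saved as ISO-8859-1 with a PEP 263 declaration on the first line.
with_cookie = b"# -*- coding: iso-8859-1 -*-\n" + body.encode("iso-8859-1")

print(detected_encoding(with_bom))
print(detected_encoding(with_cookie))
```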
> Hm2, is all internal text representation going to wind up wchar at
> some point?
It appears that string literals will continue to denote byte strings
for quite some time. There is the -U option, so you can see for
yourself the effects of string literals denoting Unicode objects.
Clearly, a byte string type has to stay in the language. Not as
clearly, there might be a need for byte string literals. A PEP to this
effect was just withdrawn.
Regards,
Martin