[Python-Dev] Unicode Imports
Nick Coghlan
ncoghlan at gmail.com
Sat Sep 9 19:05:36 CEST 2006
David Hopwood wrote:
> Martin v. Löwis wrote:
>> Nick Coghlan schrieb:
>>
>>> So this is taking something that *already works properly on POSIX
>>> systems* and making it work on Windows as well.
>> I doubt it does without side effects. For example, an application that
>> would go through sys.path, and encode everything with
>> sys.getfilesystemencoding() currently works, but will break if the patch
>> is applied and non-mbcs strings are put on sys.path.
>
> Huh? It won't break on any path for which it is not already broken.
>
> You seem to be saying "Paths with non-mbcs strings shouldn't work on Windows,
> because they haven't worked in the past."
I think MvL is looking at it from the point of view of consumers of the list
of strings in sys.path, such as PEP 302 importer and loader objects, and tools
like modulefinder. Currently, the list of values in sys.path is limited to:
1. 8-bit strings
2. Unicode strings containing only characters which can be encoded using the
default file system encoding
For PEP 302 loaders, it is currently correct for them to take the 8-bit string
they receive and do "path.decode(sys.getfilesystemencoding())".
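As a rough sketch of what such a consumer relies on (written for a current
Python 3, where 8-bit strings are bytes objects; the helper names to_text and
is_encodable are made up for illustration and are not part of any importer
API):

```python
import sys


def to_text(entry, encoding=None):
    """Return a sys.path entry as text, decoding 8-bit strings."""
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    if isinstance(entry, bytes):
        # An 8-bit path: decode it the way a loader would.
        return entry.decode(encoding)
    # Already text: nothing to do.
    return entry


def is_encodable(entry, encoding=None):
    """Report whether a text entry survives a strict encode."""
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    try:
        entry.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True
```

A tool that assumes every entry passes is_encodable() is fine under the two
cases listed above, and is the kind of code that would start failing once
entries outside the file system encoding are allowed on sys.path.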
Kristján's patch works nicely for his application because he doesn't have to
worry about compatibility with existing loaders and utilities. The core
doesn't have that luxury.
We *might* be able to find a backwards compatible way to do it that could be
put into 2.5.x, but that effort could be spent more profitably elsewhere,
particularly since the import system in Py3k will be based entirely on
Unicode (as GvR pointed out last time this topic came up [1]).
Cheers,
Nick.
[1] http://mail.python.org/pipermail/python-dev/2006-June/066225.html
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org