[Python-Dev] Import and unicode: part two

Nick Coghlan ncoghlan at gmail.com
Fri Jan 21 01:00:14 CET 2011


On Fri, Jan 21, 2011 at 5:27 AM, Toshio Kuratomi <a.badger at gmail.com> wrote:
> I think that both ideas are inferior to mandating that every python module
> filename is ascii.  What I'm getting from Victor's posts is that he, at
> least, considers the portability problems to be ignorable because dealing
> with ambiguous file name encodings is something that he'd like to force
> third party tools to deal with.

I think you're starting from an incorrect premise: we *already* allow
non-ASCII module names in Py3k. They just don't always work properly,
which is why people are currently much, much better off using pure ASCII
for their module names (as ASCII is still the lowest common
denominator for internet communication).

However, you are proposing that, instead of attempting to fix at least
some of the cases where it doesn't work, we throw up our hands and
tell people "Since some poorly configured systems have trouble with
this feature, we're taking it away from everybody. Sorry if this
breaks your code." While there may be situations where that's a valid
approach, this isn't one of them.

Yes, non-ASCII filenames cause problems for all sorts of reasons (with
Python's historically poor support being one of them). The idea is
that we're striving to no longer be part of that problem, even if it
isn't within our power to fix it entirely. Once we fix the core to
handle various Unicode issues, then over time that support can ripple
out through the rest of the Python ecosystem - we don't expect
everything to magically "just work" as soon as the basic issue in the
core is fixed. It's going to be *years* before non-ASCII file names
are as portable as pure ASCII ones (it kind of reminds me of the era
when you had to avoid spaces in filenames because so many applications
choked on them, even after the OS had been updated to support them).

As for the question of filenames not being re-encoded properly when
copied between two systems: yes, that *is* a problem with the
third party tools used to do the copying. Such tools will break any
code that uses the str APIs to access the filesystem.
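
To make the failure concrete, here is a minimal sketch. It assumes a
file named on a Latin-1 system and copied byte-for-byte to a system
whose filesystem encoding is UTF-8; the specific name and encodings are
illustrative assumptions, not taken from any real report. It uses
explicit codecs rather than the platform's filesystem encoding so the
result does not depend on where it is run.

```python
# Name as the author typed it on the (assumed) Latin-1 source system.
original = "caf\u00e9.py"

# Bytes stored on disk; a naive copy tool transfers these unchanged.
raw_on_disk = original.encode("latin-1")

# Strict UTF-8 decoding on the (assumed) destination system fails outright.
try:
    raw_on_disk.decode("utf-8")
    strict_decode_failed = False
except UnicodeDecodeError:
    strict_decode_failed = True

# With the surrogateescape error handler (PEP 383), the undecodable byte
# is smuggled through as a lone surrogate, so the str no longer matches
# the name the author intended.
seen_by_python = raw_on_disk.decode("utf-8", "surrogateescape")

# The original bytes can still be recovered, so the file stays reachable.
round_tripped = seen_by_python.encode("utf-8", "surrogateescape")
```

The str seen on the destination system differs from the intended name,
so anything that looks the file up by that name will miss it, even
though the bytes themselves survive the round trip.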

To deal with the case of undecodable filenames that the import system
skips over, it is certainly possible that importlib or runpy (probably
the former) could acquire a function that allowed a named file to be
imported directly (with a specific module name) rather than requiring
the import system to search for it.
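
As a rough sketch of what such a function might look like, the code
below loads a file from an explicit path under a caller-chosen module
name. It relies on `importlib.util.spec_from_file_location` and
`importlib.util.module_from_spec`, which exist in later Python 3
releases; they are not something this message proposes by name. The
helpers `import_file_as` and `demo`, and the file and module names, are
hypothetical.

```python
import importlib.util
import os
import sys
import tempfile


def import_file_as(module_name, path):
    """Load the source file at `path` as `module_name`, with no path search."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot build an import spec for {path!r}")
    module = importlib.util.module_from_spec(spec)
    # Register before executing, as the regular import machinery does.
    sys.modules[module_name] = module
    try:
        spec.loader.exec_module(module)
    except BaseException:
        # Do not leave a half-initialised module behind on failure.
        sys.modules.pop(module_name, None)
        raise
    return module


def demo():
    """Import a file whose name could never be found by a normal import."""
    with tempfile.TemporaryDirectory() as tmp:
        # The file name is not a valid identifier, so `import` cannot name it.
        path = os.path.join(tmp, "not-an-identifier.py")
        with open(path, "w", encoding="utf-8") as f:
            f.write("VALUE = 42\n")
        module = import_file_as("demo_module", path)
        return module.VALUE


if __name__ == "__main__":
    print(demo())
```

The point of the design is that the module name and the file name are
decoupled: the caller supplies both, so the import system never has to
decode or match the file name itself.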

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

