[Import-SIG] PEP 420 issue: extend_path

Thu May 10 05:31:15 CEST 2012

On Thu, May 10, 2012 at 12:22 PM, Barry Warsaw <barry at python.org> wrote:
> On May 10, 2012, at 10:55 AM, Nick Coghlan wrote:
>>I'd prefer to keep a consistent constraint of "iterable of path
>>entries" for the second value, and allow people to return a non-empty
>>iterable when they're returning a loader (since it will be ignored
>>anyway). See my proposed revision to PathFinder.find_module in my
>>other reply.
>
> I don't think the implementation should constrain the specification.  Rather,
> what makes the most sense to someone reading the PEP, or the future language
> reference?

I agree completely, but the place where I see the implementation
driving the specification is the decision to call len() directly on
the returned value in PathFinder.find_module, which unnecessarily
constrains the return type.

That call is completely unnecessary. Remove it and call
namespace.extend_path() unconditionally and the simple "iterable of
path entries" definition works. By keeping it, the implementation is
forcing the specification to tighten the requirement from "iterable of
strings" to "sequence of strings".
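
A minimal sketch of the difference, assuming a finder that produces its
path entries lazily. The function names here are illustrative only and
are not taken from the PEP 420 reference implementation:

```python
def portions():
    # Hypothetical finder result: a lazy iterable of path entries.
    yield "/site-a/pkg"
    yield "/site-b/pkg"


def needs_sequence(entries):
    # Stand-in for the len() check: only works on sized containers.
    return len(entries) > 0


def accepts_iterable(namespace_path, entries):
    # Stand-in for an unconditional extend: any iterable will do,
    # and an empty one simply adds nothing.
    namespace_path.extend(entries)
    return namespace_path


try:
    needs_sequence(portions())
    len_ok = True
except TypeError:
    # Generators have no len(), so the check itself fails.
    len_ok = False

path = accepts_iterable([], portions())
```

With the len() call, a generator is rejected before it is ever
consumed; with an unconditional extend, it is handled the same way as a
list or tuple.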

The specification should also take into account what's *easiest* for
the API consumer. Forcing API users to check for None in the second
element of the returned tuple is just obnoxious when the spec could
instead say to return an empty iterable in this case.

> In that respect, I think it's better to define the second item as "ignored" or
> None when not-None is returned as the first element.  Requiring the return of
> an empty sequence when the value is semantically ignored makes no sense.

We can still provide advice on what a well-behaved loader *should* do,
even when it's not technically a requirement. However, I also really
dislike conditional constraints on values - I believe it leads to much
cleaner designs overall if the constraints on different elements are
orthogonal. (You can't always achieve that, but when it's both
possible and easy, as in this case, it's worth doing).

> There's also a semantic difference between returning None and returning an
> empty sequence as the second element when the first element is None.  In the
> matrix of return states, "(None, ())" means "I found some namespace portions,
> and the number of portions I found is zero" which is clearly nonsensical, and
> subtly different than "I found neither a normal package nor portions of a
> namespace package."

I think you're making up a distinction that doesn't exist. Both "None"
and "()" (or any other empty container) would mean "no portions found"
in practice, but the former requires an explicit check on the part of
the API consumer, while the latter will be naturally ignored by
ordinary iterable processing.
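
To make the consumer-side cost concrete, here is a rough sketch that
assumes finder results arrive as (loader, portions) tuples. The helper
names and sample data are hypothetical:

```python
def merge_portions(results):
    # Relies on the second element always being an iterable.
    path = []
    for loader, portions in results:
        if loader is None:
            path.extend(portions)  # an empty iterable is a no-op
    return path


def merge_portions_none_aware(results):
    # The same logic when None is allowed as the second element:
    # every consumer has to carry the extra check.
    path = []
    for loader, portions in results:
        if loader is None and portions is not None:
            path.extend(portions)
    return path


empty_style = [(None, ()), (None, ["/site-a/pkg"])]
none_style = [(None, None), (None, ["/site-a/pkg"])]
```

Both helpers produce the same path for their own input style, but
passing none_style to merge_portions raises TypeError, because None
cannot be iterated.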

Consider the old API, where the only return options were a loader, a
string or None. Why introduce an arbitrary distinction between (None,
()) and (None, None), when we can simply declare the latter invalid
behaviour on the finder's part?

My proposal means there would only be two valid possible returns from
find_loader:

1. (loader, <iterable of path entries>)
2. (None, <iterable of path entries>)

In the first case, the iterable of path entries (which may be empty)
is ignored. In the latter case, the iterable of path entries (which
may be empty) is added to the prospective namespace package path.
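
One way the calling side could look under this proposal, as a sketch
only. The finder classes and the search() function are hypothetical
stand-ins, and the loader is represented by a plain string:

```python
class LoaderFinder:
    # Hypothetical finder that always finds a regular package.
    def find_loader(self, fullname):
        return ("<loader for %s>" % fullname, [])


class PortionFinder:
    # Hypothetical finder that only contributes namespace portions.
    def __init__(self, portions):
        self._portions = portions

    def find_loader(self, fullname):
        # A lazy iterable is fine: the caller never calls len() on it.
        return (None, iter(self._portions))


def search(finders, fullname):
    namespace_path = []
    for finder in finders:
        loader, portions = finder.find_loader(fullname)
        if loader is not None:
            # Case 1: a loader was found, so the portions are ignored.
            return loader, []
        # Case 2: no loader, so extend unconditionally. An empty
        # iterable means "no portions here" and adds nothing.
        namespace_path.extend(portions)
    return None, namespace_path
```

For example, search([PortionFinder(["/a/pkg"]), PortionFinder([])],
"pkg") returns (None, ["/a/pkg"]), with no special handling needed for
the finder that found nothing.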

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia