[Distutils] FINAL DRAFT: Dependency specifier PEP

Robert Collins robertc at robertcollins.net
Tue Nov 17 15:02:31 EST 2015

On 18 November 2015 at 08:53, Paul Moore <p.f.moore at gmail.com> wrote:
> On 17 November 2015 at 18:43, Robert Collins <robertc at robertcollins.net> wrote:
>>> By including the URL syntax, we're mandating that conforming
>>> implementations *have* to trap malformed URLs early, and can't defer
>>> that validation to the URL library being used to process the URL.
>> I don't understand how we're mandating that.
> urlspec       = '@' wsp* <URI_reference>
> combined with
>  URI_reference = <URI | relative_ref>
>  URI           = scheme ':' hier_part ('?' query )? ( '#' fragment)?
> (etc)
> implies that conforming parsers have to validate that what follows '@'
> must conform to the URI definition. So they have to reject @:::::
> because ::::: is not a valid URI. But why bother? It's extra work, and
> given that all an implementation will ever do with the URI_reference
> is pass it to a function that treats it as a URI, and that function
> will do all the validation you need.
> I'd argue that the spec can simply say
> URI_reference = <string with no whitespace>
> The discussion of how a urlspec is used can point out that the string
> will be assumed to be a URI.
> A library that parsed any non-whitespace string as a URI_reference
> would be just as useful for all practical purposes, and much easier to
> write (and test!) But it would technically be non-conformant to this
> PEP.
> Personally, I don't actually care all that much, as I probably won't
> ever write a library that implements this spec. The packaging library
> will be fine for me. But given that the point of writing the
> interoperability PEPs is to ensure people *can* write alternative
> implementations, I'm against adding complexity and implementation
> burden that has no practical benefit.

I'm still struggling to understand.

I see two angles; the first is on what is accepted or not by an implementation:
The reference here is not the implementation - it's a *reference*. An
implementation whose URI handling can't handle STD 66 URIs that
another one's can would lead to interop issues, and that's what we're
trying to avoid. An alternative implementation whose URI handling has
some extension that means it handles things that other implementations
don't would accept everything the PEP mandates but also accept more -
leading to interop issues. Some interop issues (e.g. pip handles
git+https:// URLs, setuptools doesn't) are not covered yet, but that's
a PEP 440 issue (at least, the way things are split up today) - so I
don't want to dive into that.

The second is on whether the implementation achieves that acceptance
up front in its parsing, or on the backend in its URI library. And I
couldn't care less which way around it does it. We're not defining
implementation, but we are defining the language.

As I understand it, you and Antoine are saying that the current PEP
*does* define implementation because folk can't trust their URI
library to error appropriately - and that's the bit I don't understand.
Just parse however you want as an author, and cross check against the
full grammar here in case of doubt.
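
To make "parse however you want, then cross check" concrete, here is a
rough Python sketch. It assumes a lenient front end that takes whatever
follows '@' as an opaque token, plus a separate and deliberately partial
check covering only two STD 66 rules (the scheme syntax, and no colon in
the first segment of a relative reference). The names split_urlspec and
looks_like_uri_reference are made up for this illustration; they are not
from the PEP or from any library.

```python
import re

# Hypothetical helpers for illustration only; not part of the PEP.
# STD 66 scheme: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
_SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*\Z")


def split_urlspec(text):
    """Lenient front end: return whatever follows '@' as an opaque token."""
    if not text.startswith("@"):
        raise ValueError("urlspec must start with '@'")
    token = text[1:].lstrip(" \t")
    if not token or any(ch.isspace() for ch in token):
        raise ValueError("expected a non-empty token without whitespace")
    return token


def looks_like_uri_reference(token):
    """Partial cross-check against two STD 66 rules; not a full validator."""
    # Everything before the first '/', '?' or '#' is either "scheme:..."
    # or the first path segment of a relative reference.
    head = re.split(r"[/?#]", token, maxsplit=1)[0]
    if ":" not in head:
        return True  # relative reference with no colon in its first segment
    scheme = head.split(":", 1)[0]
    return bool(_SCHEME.match(scheme))


token = split_urlspec("@:::::")
print(token, looks_like_uri_reference(token))
```

The lenient step accepts @::::: and the cross-check then flags it, so
the malformed input can be rejected at either layer; which layer does
the rejecting is an implementation choice.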


Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud
