[Distutils] FINAL DRAFT: Dependency specifier PEP

R. David Murray rdmurray at bitdance.com
Tue Nov 17 16:22:03 EST 2015


On Wed, 18 Nov 2015 09:02:31 +1300, Robert Collins <robertc at robertcollins.net> wrote:
> On 18 November 2015 at 08:53, Paul Moore <p.f.moore at gmail.com> wrote:
> > On 17 November 2015 at 18:43, Robert Collins <robertc at robertcollins.net> wrote:
> >>> By including the URL syntax, we're mandating that conforming
> >>> implementations *have* to trap malformed URLs early, and can't defer
> >>> that validation to the URL library being used to process the URL.
> >>
> >> I don't understand how we're mandating that.
> >
> > urlspec       = '@' wsp* <URI_reference>
> >
> > combined with
> >
> >  URI_reference = <URI | relative_ref>
> >  URI           = scheme ':' hier_part ('?' query )? ( '#' fragment)?
> > (etc)
> >
> > implies that conforming parsers have to validate that what follows '@'
> > conforms to the URI definition. So they have to reject @:::::
> > because ::::: is not a valid URI. But why bother? It's extra work,
> > given that all an implementation will ever do with the URI_reference
> > is pass it to a function that treats it as a URI, and that function
> > will do all the validation you need.
> >
> > I'd argue that the spec can simply say
> >
> > URI_reference = <string with no whitespace>
> >
> > The discussion of how a urlspec is used can point out that the string
> > will be assumed to be a URI.
> >
> > A library that parsed any non-whitespace string as a URI_reference
> > would be just as useful for all practical purposes, and much easier to
> > write (and test!). But it would technically be non-conformant to this
> > PEP.
> >
> > Personally, I don't actually care all that much, as I probably won't
> > ever write a library that implements this spec. The packaging library
> > will be fine for me. But given that the point of writing the
> > interoperability PEPs is to ensure people *can* write alternative
> > implementations, I'm against adding complexity and implementation
> > burden that has no practical benefit.
> 
> 
> I'm still struggling to understand.
> 
> I see two angles; the first is on what is accepted or not by an implementation:
> The reference here is not the implementation - it's a *reference*. An
> implementation whose URI handling can't handle std-66 URIs that
> another one's can would lead to interop issues: and that's what we're
> trying to avoid. An alternative implementation whose URI handling has
> some extension that means it handles things that other implementations
> don't would accept everything the PEP mandates but also accept more -
> leading to interop issues. Some interop issues (e.g. pip handles
> git+https:// URLs, setuptools doesn't) are not covered yet, but that's
> a pep-440 issue (at least, the way things are split up today) - so I
> don't want to dive into that.
> 
> The second is on whether the implementation achieves that acceptance
> up front in its parsing, or on the backend in its URI library. And I
> could care way less which way around it does it. We're not defining
> implementation, but we are defining the language.
> 
> As I understand it, you and Antoine are saying that the current PEP
> *does* define implementation because folk can't trust their URI
> library to error appropriately - and that's the bit I don't understand.
> Just parse however you want as an author, and cross check against the
> full grammar here in case of doubt.

OK, so it *is* the case that the PEP is mandating that a conforming
implementation has to accept valid and reject invalid URLs according
to the grammar in the PEP, but not *how* or *when* it does that (the
implementation).  So "trap malformed URLs early" is false, but "trap
malformed URLs" is true, if you want to be a conformant implementation.
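
To make the how/when distinction concrete, here is a minimal sketch in
Python. Everything in it is hypothetical: the function names, the
MalformedURL exception, and above all looks_like_uri_reference(), which
is a deliberately simplified stand-in for the PEP's URI_reference
grammar rather than a real std-66 validator. Both strategies reject
"@ :::::"; they differ only in when the rejection happens.

```python
import re


class MalformedURL(ValueError):
    """Hypothetical error raised for a URL that fails the grammar check."""


def looks_like_uri_reference(text):
    # Simplified stand-in for the PEP's URI_reference rule (an assumption
    # for illustration, NOT the full std-66 grammar). It rejects empty
    # strings for simplicity, even though the real grammar permits an
    # empty relative reference.
    if not text or any(ch.isspace() for ch in text):
        return False
    # Everything before the first '/', '?' or '#'.
    head = re.split(r"[/?#]", text, maxsplit=1)[0]
    if ":" in head:
        # A colon in the first segment is only legal after a valid scheme.
        return re.match(r"[A-Za-z][A-Za-z0-9+.-]*:", head) is not None
    return True


def parse_early(spec):
    # Strategy 1: validate the URL while parsing the specifier.
    name, _, url = spec.partition("@")
    url = url.strip()
    if not looks_like_uri_reference(url):
        raise MalformedURL(url)
    return name.strip(), url


def parse_lazy(spec):
    # Strategy 2: accept any string here and defer validation.
    name, _, url = spec.partition("@")
    return name.strip(), url.strip()


def resolve(url):
    # The deferred check, standing in for "the URL library" doing the
    # validation when the URL is finally used.
    if not looks_like_uri_reference(url):
        raise MalformedURL(url)
    return url
```

With this sketch, parse_early("pkg @ :::::") raises immediately, while
parse_lazy("pkg @ :::::") succeeds and the later resolve(":::::") call
raises instead. Either way the malformed URL is trapped, which is all
conformance asks for.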

--David

