[Python-Dev] accept the wheel PEPs 425, 426, 427

Daniel Holth dholth at gmail.com
Fri Oct 19 05:21:53 CEST 2012


On Thu, Oct 18, 2012 at 10:55 PM, Stephen J. Turnbull
<stephen at xemacs.org> wrote:
> Executive summary:
>
> You probably should include a full ABNF grammar....
>
> Daniel Holth writes:
>
>  > To support empty lines and lines with indentation with respect to
>  > the RFC 822 format, any CRLF character has to be suffixed by 7 spaces
>  > followed by a pipe ("|") char. [...]
>  > This encoding implies that any occurrences of a CRLF followed by 7 spaces
>  > and a pipe char have to be replaced by a single CRLF when the field
>  > is unfolded using a RFC822 reader.
>
> This isn't RFC 822 unfolding at all.  An RFC 822 "reader" will simply
> remove the CRLF and optionally "canonicalize" the spaces (the latter
> is not allowed by RFC 822, but sometimes it's observed).  This implies
> that if you use an RFC 822 reader, you need to replace instances of the
> regexp r"\s+\|" with a newline.  (If you have a conforming reader, you
> can use the regexp r"\s{7}\|" instead.)  And of course you have to
> RFC-2047-encode non-ASCII in an RFC-822 field.
>
> So please don't refer to the basic format ("field-name: field-body"
> followed by optional continuation lines) as "RFC822".  "Inspired by
> RFC 822" maybe.  Better "chosen to resemble the familiar RFC 822
> header format used in email and netnews."  (Note that RFC 822 is
> actually ambiguous even about the basic format; section 3.4.2 implies
> that "name   :body" would be an acceptable field, although section
> 3.1.2 doesn't seem to allow space before the colon.  Referring to RFC
> 822 as a standard here is a bad idea.  There is a reason why that
> standard gets revised/replaced periodically!)
>
> I don't understand why you specify that the newline is represented by
> CRLF *after* unfolding.  Once unfolded, these fields are all what
> RFC822 would call "unstructured fields" (in the context of that RFC).
> They will contain text followed by a terminating CRLF, but including
> no others.  In fact that CRLF is redundant, and may as well be
> stripped (and probably will be, in most implementations).
>
> I don't understand why you specify newline as CRLF here, except to
> pretend that you're respecting RFC 822.  But all you're using are the
> division of a field into field-name and field-body by a colon, and the
> convention that a newline followed by folding whitespace is a
> continuation line.  These are both trivial to implement, and almost
> all implementations will undoubtedly read the file as *text* in
> universal newline mode.  I see no reason to specify a binary format.
>
>  > Author-email (optional)
>  > :::::::::::::::::::::::
>  >
>  > A string containing the author's e-mail address.  It can contain
>  > a name and e-mail address in the legal forms for a RFC-822
>  > ``From:`` header.
>
> Heavens above, no!  From RFC 822, this:
>
>     Wilt . (the  Stilt) Chamberlain at NBA.US
>
> is a legal email address, which probably would be represented
> conventionally as
>
>     "Wilt (the Stilt) Chamberlain" <Wilt.Chamberlain at NBA.US>
>
> However, it's not at all clear that all mail clients, let alone just
> plain folks, will interpret the first form correctly.  And there are
> worse examples given in that RFC.  Is there a reason why you can't
> require these to be in the form recommended by RFC 5322 (ie, the
> "conventional representation" above)?  Or you could relax this so that
> the quotes are prohibited.
>
>  > License (optional)
>  > ::::::::::::::::::
>  >
>  > Text indicating the license covering the distribution where the license
>  > is not a selection from the "License" Trove classifiers. See
>  > "Classifier" below.  This field may also be used to specify a
>  > particular version of a licencse which is named via the ``Classifier``
>                                 A
> typo----------------------------+
>
>  > field, or to indicate a variation or exception to such a license.
>
> This won't do as is.  It doesn't exclude the possibility of including
> a complete license, and if that is intentional, this field needs to be
> in the same format as "Distribution".  Licenses are complex documents,
> needing at least some of the power of something like ReST.  You may as
> well give them all of it.
>
>  > Project-URL (multiple-use)
>  > Provides-Extra (multiple use)
>
> Hyphen or no hyphen?  Consistency is good.

I will include or remove the hyphen.

Your other comments are also true of the predecessor Metadata 1.2.

The | folding discussion could probably die. Personally I do not
respect RFC 822 at all (in this format). I rather expect the pragmatic
implementer to write more or less [line.split(':', 1) for line in
open('METADATA') if line[0].isalpha()]. The fields that matter at
runtime (Name, Version, Requires-Dist, Provides-Extra) are all
single-line only. Basically everything else is a curiosity for the
human reader.
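
Spelled out a little, that one-liner might look like the sketch below.
It is only an illustration, not a reference parser: the function name
read_metadata_fields and the sample field values are made up for the
example, and it deliberately ignores folded continuation lines, as the
one-liner does.

```python
import io


def read_metadata_fields(lines):
    """Return (name, value) pairs for lines that begin with a letter.

    Hypothetical helper: continuation lines (which start with
    whitespace) and blank lines are skipped, not unfolded.
    """
    fields = []
    for line in lines:
        # line[:1] is safe even for an empty string, unlike line[0].
        if not line[:1].isalpha():
            continue
        if ':' not in line:
            continue
        name, value = line.split(':', 1)
        fields.append((name.strip(), value.strip()))
    return fields


# Made-up sample standing in for open('METADATA').
sample = io.StringIO(
    "Name: example\n"
    "Version: 1.0\n"
    "Requires-Dist: requests\n"
    "Provides-Extra: test\n"
    "Description: first line\n"
    "       |second line\n"
)
fields = read_metadata_fields(sample)
```

Only the first line of the folded Description survives here, which is
fine for the runtime fields since they are all single-line.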

The .dist-info spec (PEP 376) or the wheel spec should gain a well-known
file, package-1.0.dist-info/LICENSE. Many open source licenses require
that you include the license with every copy of the program.

Thanks,

Daniel Holth

