[Python-Dev] accept the wheel PEPs 425, 426, 427
Daniel Holth
dholth at gmail.com
Fri Oct 19 05:21:53 CEST 2012
On Thu, Oct 18, 2012 at 10:55 PM, Stephen J. Turnbull
<stephen at xemacs.org> wrote:
> Executive summary:
>
> You probably should include a full ABNF grammar....
>
> Daniel Holth writes:
>
> > To support empty lines and lines with indentation with respect to
> > the RFC 822 format, any CRLF character has to be suffixed by 7 spaces
> > followed by a pipe ("|") char. [...]
> > This encoding implies that any occurrences of a CRLF followed by 7 spaces
> > and a pipe char have to be replaced by a single CRLF when the field
> > is unfolded using a RFC822 reader.
>
> This isn't RFC 822 unfolding at all. An RFC 822 "reader" will simply
> remove the CRLF and optionally "canonicalize" the spaces (the latter
> is not allowed by RFC 822, but sometimes it's observed). This implies
> that if you use an RFC 822 reader, you need to replace instances of the
> regexp r"\s+\|" with a newline. (If you have a conforming reader, you
> can use the regexp r"\s{7}\|" instead.) And of course you have to
> RFC-2047-encode non-ASCII in an RFC-822 field.
>
> So please don't refer to the basic format ("field-name: field-body"
> followed by optional continuation lines) as "RFC822". "Inspired by
> RFC 822" maybe. Better "chosen to resemble the familiar RFC 822
> header format used in email and netnews." (Note that RFC 822 is
> actually ambiguous even about the basic format; section 3.4.2 implies
> that "name :body" would be an acceptable field, although section
> 3.1.2 doesn't seem to allow space before the colon. Referring to RFC
> 822 as a standard here is a bad idea. There is a reason why that
> standard gets revised/replaced periodically!)
>
> I don't understand why you specify that the newline is represented by
> CRLF *after* unfolding. Once unfolded, these fields are all what
> RFC822 would call "unstructured fields" (in the context of that RFC).
> They will contain text followed by a terminating CRLF, but including
> no others. In fact that CRLF is redundant, and may as well be
> stripped (and probably will be, in most implementations).
>
> I don't understand why you specify newline as CRLF here, except to
> pretend that you're respecting RFC 822. But all you're using are the
> division of a field into field-name and field-body by a colon, and the
> convention that a newline followed by folding whitespace is a
> continuation line. These are both trivial to implement, and almost
> all implementations will undoubtedly read the file as *text* in
> universal newline mode. I see no reason to specify a binary format.
>
> > Author-email (optional)
> > :::::::::::::::::::::::
> >
> > A string containing the author's e-mail address. It can contain
> > a name and e-mail address in the legal forms for a RFC-822
> > ``From:`` header.
>
> Heavens above, no! From RFC 822, this:
>
> Wilt . (the Stilt) Chamberlain at NBA.US
>
> is a legal email address, which probably would be represented
> conventionally as
>
> "Wilt (the Stilt) Chamberlain" <Wilt.Chamberlain at NBA.US>
>
> However, it's not at all clear that all mail clients, let alone just
> plain folks, will interpret the first form correctly. And there are
> worse examples given in that RFC. Is there a reason why you can't
> require these to be in the form recommended by RFC 5322 (ie, the
> "conventional representation" above)? Or you could relax this so that
> the quotes are prohibited.
>
> > License (optional)
> > ::::::::::::::::::
> >
> > Text indicating the license covering the distribution where the license
> > is not a selection from the "License" Trove classifiers. See
> > "Classifier" below. This field may also be used to specify a
> > particular version of a licencse which is named via the ``Classifier``
> [typo above: "licencse" should be "license"]
>
> > field, or to indicate a variation or exception to such a license.
>
> This won't do as is. It doesn't exclude the possibility of including
> a complete license, and if that is intentional, this field needs to be
> in the same format as "Distribution". Licenses are complex documents,
> needing at least some of the power of something like ReST. You may as
> well give them all of it.
>
> > Project-URL (multiple-use)
> > Provides-Extra (multiple use)
>
> Hyphen or no hyphen? Consistency is good.
I will include or remove the hyphen.
Your other comments are also true of the predecessor Metadata 1.2.
The | folding discussion could probably die. Personally I do not
respect RFC822 at all (in this format). I rather expect the pragmatic
implementer to more or less [line.split(':', 1) for line in
open('METADATA') if line[0].isalpha()]. The fields that matter at
runtime (Name, Version, Requires-Dist, Provides-Extra) are all
single-line only. Basically everything else is a curiosity for the
human reader.
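
To make the one-liner above concrete, here is a minimal sketch of what such a pragmatic reader might look like. The function name read_metadata and the sample field values are hypothetical and only for illustration; the only behaviour taken from the text above is "split on the first colon, ignore any line that does not start with a letter". It also skips lines without a colon, which the one-liner does not do.

```python
import io


def read_metadata(lines):
    """Collect (name, value) pairs from single-line 'Name: value' fields.

    Hypothetical helper: it ignores continuation lines and blank lines
    instead of unfolding them, as the one-liner above does.
    """
    fields = []
    for line in lines:
        # Continuation lines start with whitespace and blank lines have no
        # leading letter, so both are skipped here.
        if not line[:1].isalpha() or ':' not in line:
            continue
        name, value = line.split(':', 1)
        fields.append((name.strip(), value.strip()))
    return fields


# Made-up sample data standing in for a METADATA file.
sample = io.StringIO(
    "Name: example\n"
    "Version: 1.0\n"
    "Requires-Dist: requests\n"
    "Provides-Extra: test\n"
    "Description: first line\n"
    "       |second line\n"
)
fields = read_metadata(sample)
```

With a real file one would presumably pass open('METADATA') instead of the StringIO object. Only the first line of a folded field survives, which is acceptable if the single-line fields are the ones that matter at runtime.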
The .dist-info spec (PEP 376) or the wheel spec should gain a well-known
file package-1.0.dist-info/LICENSE. Many open source licenses require
that you include the license with every copy of the program.
Thanks,
Daniel Holth