[Python-Dev] accept the wheel PEPs 425, 426, 427

Stephen J. Turnbull stephen at xemacs.org
Fri Oct 19 04:55:58 CEST 2012


Executive summary:

You probably should include a full ABNF grammar....

Daniel Holth writes:

 > To support empty lines and lines with indentation with respect to
 > the RFC 822 format, any CRLF character has to be suffixed by 7 spaces
 > followed by a pipe ("|") char. [...]
 > This encoding implies that any occurences of a CRLF followed by 7 spaces
 > and a pipe char have to be replaced by a single CRLF when the field
 > is unfolded using a RFC822 reader.

This isn't RFC 822 unfolding at all.  An RFC 822 "reader" will simply
remove the CRLF and optionally "canonicalize" the spaces (the latter
is not allowed by RFC 822, but it is sometimes seen in practice).  So
if you use an RFC 822 reader, you then need to replace every match of
the regexp r"\s+\|" with a newline yourself.  (If you have a conforming
reader, you can use the regexp r"\s{7}\|" instead.)  And of course you
have to RFC-2047-encode non-ASCII text in an RFC 822 field.
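
To make the difference concrete, here is a minimal Python sketch.  It
assumes a reader that unfolds by deleting each CRLF that precedes
whitespace; the function names rfc822_unfold and restore_newlines and
the sample field body are invented for illustration and do not come
from the PEP.

```python
import re

def rfc822_unfold(raw):
    # RFC 822 unfolding: delete each CRLF that is followed by a space or
    # tab.  The whitespace itself stays in the field body.
    return re.sub(r"\r\n(?=[ \t])", "", raw)

def restore_newlines(body, conforming=True):
    # Hypothetical post-processing step: turn the "7 spaces + pipe" marker
    # back into a newline.  A reader that canonicalizes whitespace leaves
    # fewer than 7 spaces, so the looser pattern is needed in that case.
    pattern = r"\s{7}\|" if conforming else r"\s+\|"
    return re.sub(pattern, "\n", body)

# Made-up field body, folded the way the draft describes.
folded = "First line\r\n       |Second line\r\n       |\r\n       |Fourth line"

unfolded = rfc822_unfold(folded)
# The marker survives unfolding; only the CRLFs are gone.
assert "\r\n" not in unfolded and "       |" in unfolded

text = restore_newlines(unfolded)

# A reader that also collapses runs of whitespace needs the looser regexp.
canonicalized = re.sub(r"[ \t]+", " ", unfolded)
text_from_canonicalized = restore_newlines(canonicalized, conforming=False)
```

Note that the looser pattern will also rewrite any ordinary " |" that
happens to occur in the text, which is one more reason to spell the
encoding out in a grammar.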

So please don't refer to the basic format ("field-name: field-body"
followed by optional continuation lines) as "RFC822".  "Inspired by
RFC 822" maybe.  Better "chosen to resemble the familiar RFC 822
header format used in email and netnews."  (Note that RFC 822 is
actually ambiguous even about the basic format; section 3.4.2 implies
that "name   :body" would be an acceptable field, although section
3.1.2 doesn't seem to allow space before the colon.  Referring to RFC
822 as a standard here is a bad idea.  There is a reason why that
standard gets revised/replaced periodically!)

I don't understand why you specify that the newline is represented by
CRLF *after* unfolding.  Once unfolded, these fields are all what
RFC 822 would call "unstructured fields" (in the context of that RFC).
They will contain text followed by a terminating CRLF, and no other
CRLFs.  In fact that terminating CRLF is redundant, and may as well be
stripped (and probably will be, in most implementations).

I don't understand why you specify newline as CRLF here, except to
pretend that you're respecting RFC 822.  But all you're using are the
division of a field into field-name and field-body by a colon, and the
convention that a newline followed by folding whitespace is a
continuation line.  These are both trivial to implement, and almost
all implementations will undoubtedly read the file as *text* in
universal newline mode.  I see no reason to specify a binary format.
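
As a rough illustration of how little is involved, the following
Python sketch handles exactly those two conventions on text input.
The name parse_fields is hypothetical, and stopping at the first blank
line is my assumption, not something the draft states.

```python
def parse_fields(text):
    """Split text into (field-name, field-body) pairs, joining continuations."""
    # Normalize line endings the way universal newline mode would.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    fields = []
    for line in text.split("\n"):
        if not line:
            break  # assumption: a blank line ends the header block
        if line[0] in " \t" and fields:
            # Continuation line: append it to the previous field body,
            # keeping its leading whitespace as RFC 822 unfolding does.
            name, body = fields[-1]
            fields[-1] = (name, body + line)
        else:
            # First colon divides field-name from field-body.
            name, _, body = line.partition(":")
            fields.append((name.strip(), body.strip()))
    return fields

# The same made-up metadata with CRLF and with LF line endings.
crlf = "Name: wheel\r\nSummary: A built-package\r\n format\r\n"
lf = "Name: wheel\nSummary: A built-package\n format\n"
parsed = parse_fields(crlf)
```

Both inputs give the same result, so nothing in the parser depends on
which newline convention the file was written with.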

 > Author-email (optional)
 > :::::::::::::::::::::::
 > 
 > A string containing the author's e-mail address.  It can contain
 > a name and e-mail address in the legal forms for a RFC-822
 > ``From:`` header.

Heavens above, no!  From RFC 822, this:

    Wilt . (the  Stilt) Chamberlain@NBA.US

is a legal email address, which probably would be represented
conventionally as

    "Wilt (the Stilt) Chamberlain" <Wilt.Chamberlain@NBA.US>

However, it's not at all clear that all mail clients, let alone just
plain folks, will interpret the first form correctly.  And there are
worse examples given in that RFC.  Is there a reason why you can't
require these to be in the form recommended by RFC 5322 (i.e., the
"conventional representation" above)?  Or you could relax this so that
the quotes are prohibited.
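
One way a tool could enforce the conventional form is a round-trip
check with the standard library's email.utils, sketched below.  The
helper name is_conventional is hypothetical, and this is only one
possible check, not something the PEP specifies.

```python
from email.utils import formataddr, parseaddr

def is_conventional(value):
    # Accept the value only if parsing it and formatting the result again
    # reproduces it exactly, i.e. it is already in the form
    #   "Display Name" <local@domain>   or   local@domain
    name, address = parseaddr(value)
    return bool(address) and formataddr((name, address)) == value

conventional = '"Wilt (the Stilt) Chamberlain" <Wilt.Chamberlain@NBA.US>'
obsolete = "Wilt . (the  Stilt) Chamberlain@NBA.US"

ok = is_conventional(conventional)
rejected = not is_conventional(obsolete)
```

The obsolete form fails the check however the parser chooses to read
it, because formataddr never emits embedded comments or spaces around
the dot.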

 > License (optional)
 > ::::::::::::::::::
 > 
 > Text indicating the license covering the distribution where the license
 > is not a selection from the "License" Trove classifiers. See
 > "Classifier" below.  This field may also be used to specify a
 > particular version of a licencse which is named via the ``Classifier``
                                A
typo----------------------------+

 > field, or to indicate a variation or exception to such a license.

This won't do as is.  It doesn't exclude the possibility of including
a complete license, and if that is intentional, this field needs to be
in the same format as "Distribution".  Licenses are complex documents,
needing at least some of the power of something like ReST.  You may as
well give them all of it.

 > Project-URL (multiple-use)
 > Provides-Extra (multiple use)

Hyphen or no hyphen?  Consistency is good.


