webmaster has already heard from 4 people who cannot install it.
I sent them to the bug tracker or to python-list, but they seem
not to have gone to either place. Is there some guide I should be
sending them to, along the lines of 'how to debug installation problems'?
Laura
Hi,
I'd like to submit this PEP for discussion. It is quite specialized
and the main target audience of the proposed changes is
users and authors of applications/libraries transferring large amounts
of data (read: the scientific computing & data science ecosystems).
https://www.python.org/dev/peps/pep-0574/
The PEP text is also inlined below.
Regards
Antoine.
PEP: 574
Title: Pickle protocol 5 with out-of-band data
Version: $Revision$
Last-Modified: $Date$
Author: Antoine Pitrou <solipsis(a)pitrou.net>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 23-Mar-2018
Post-History:
Resolution:
Abstract
========
This PEP proposes to standardize a new pickle protocol version, and
accompanying APIs to take full advantage of it:
1. A new pickle protocol version (5) to cover the extra metadata needed
for out-of-band data buffers.
2. A new ``PickleBuffer`` type for ``__reduce_ex__`` implementations
to return out-of-band data buffers.
3. A new ``buffer_callback`` parameter when pickling, to handle out-of-band
data buffers.
4. A new ``buffers`` parameter when unpickling to provide out-of-band data
buffers.
The PEP guarantees unchanged behaviour for anyone not using the new APIs.
Rationale
=========
The pickle protocol was originally designed in 1995 for on-disk persistency
of arbitrary Python objects. The performance of a 1995-era storage medium
probably made it irrelevant to focus on performance metrics such as
use of RAM bandwidth when copying temporary data before writing it to disk.
Nowadays the pickle protocol sees a growing use in applications where most
of the data isn't ever persisted to disk (or, when it is, it uses a portable
format instead of a Python-specific one). Instead, pickle is being used to transmit
data and commands from one process to another, either on the same machine
or on multiple machines. Those applications will sometimes deal with very
large data (such as Numpy arrays or Pandas dataframes) that need to be
transferred around. For those applications, pickle is currently
wasteful as it imposes spurious memory copies of the data being serialized.
As a matter of fact, the standard ``multiprocessing`` module uses pickle
for serialization, and therefore also suffers from this problem when
sending large data to another process.
Third-party Python libraries, such as Dask [#dask]_, PyArrow [#pyarrow]_
and IPyParallel [#ipyparallel]_, have started implementing alternative
serialization schemes with the explicit goal of avoiding copies on large
data. Implementing a new serialization scheme is difficult and often
leads to reduced generality (since many Python objects support pickle
but not the new serialization scheme). Falling back on pickle for
unsupported types is an option, but then you get back the spurious
memory copies you wanted to avoid in the first place. For example,
``dask`` is able to avoid memory copies for Numpy arrays and
built-in containers thereof (such as lists or dicts containing Numpy
arrays), but if a large Numpy array is an attribute of a user-defined
object, ``dask`` will serialize the user-defined object as a pickle
stream, leading to memory copies.
The common theme of these third-party serialization efforts is to generate
a stream of object metadata (which contains pickle-like information about
the objects being serialized) and a separate stream of zero-copy buffer
objects for the payloads of large objects. Note that, in this scheme,
small objects such as ints, etc. can be dumped together with the metadata
stream. Refinements can include opportunistic compression of large data
depending on its type and layout, like ``dask`` does.
This PEP aims to make ``pickle`` usable in a way where large data is handled
as a separate stream of zero-copy buffers, letting the application handle
those buffers optimally.
Example
=======
To keep the example simple and avoid requiring knowledge of third-party
libraries, we will focus here on a bytearray object (but the issue is
conceptually the same with more sophisticated objects such as Numpy arrays).
Like most objects, the bytearray object isn't immediately understood by
the pickle module and must therefore specify its decomposition scheme.
Here is how a bytearray object currently decomposes for pickling::
>>> b.__reduce_ex__(4)
(<class 'bytearray'>, (b'abc',), None)
This is because the ``bytearray.__reduce_ex__`` implementation reads
morally as follows::
class bytearray:
def __reduce_ex__(self, protocol):
if protocol == 4:
return type(self), bytes(self), None
# Legacy code for earlier protocols omitted
In turn it produces the following pickle code::
>>> pickletools.dis(pickletools.optimize(pickle.dumps(b, protocol=4)))
0: \x80 PROTO 4
2: \x95 FRAME 30
11: \x8c SHORT_BINUNICODE 'builtins'
21: \x8c SHORT_BINUNICODE 'bytearray'
32: \x93 STACK_GLOBAL
33: C SHORT_BINBYTES b'abc'
38: \x85 TUPLE1
39: R REDUCE
40: . STOP
(the call to ``pickletools.optimize`` above is only meant to make the
pickle stream more readable by removing the MEMOIZE opcodes)
We can notice several things about the bytearray's payload (the sequence
of bytes ``b'abc'``):
* ``bytearray.__reduce_ex__`` produces a first copy by instantiating a
new bytes object from the bytearray's data.
* ``pickle.dumps`` produces a second copy when inserting the contents of
that bytes object into the pickle stream, after the SHORT_BINBYTES opcode.
* Furthermore, when deserializing the pickle stream, a temporary bytes
object is created when the SHORT_BINBYTES opcode is encountered (inducing
a data copy).
What we really want is something like the following:
* ``bytearray.__reduce_ex__`` produces a *view* of the bytearray's data.
* ``pickle.dumps`` doesn't try to copy that data into the pickle stream
but instead passes the buffer view to its caller (which can decide on the
most efficient handling of that buffer).
* When deserializing, ``pickle.loads`` takes the pickle stream and the
buffer view separately, and passes the buffer view directly to the
bytearray constructor.
We see that several conditions are required for the above to work:
* ``__reduce__`` or ``__reduce_ex__`` must be able to return *something*
that indicates a serializable no-copy buffer view.
* The pickle protocol must be able to represent references to such buffer
views, instructing the unpickler that it may have to get the actual buffer
out of band.
* The ``pickle.Pickler`` API must provide its caller with a way
to receive such buffer views while serializing.
* The ``pickle.Unpickler`` API must similarly allow its caller to provide
the buffer views required for deserialization.
* For compatibility, the pickle protocol must also be able to contain direct
serializations of such buffer views, such that current uses of the ``pickle``
API don't have to be modified if they are not concerned with memory copies.
Producer API
============
We are introducing a new type ``pickle.PickleBuffer`` which can be
instantiated from any buffer-supporting object, and is specifically meant
to be returned from ``__reduce__`` implementations::
class bytearray:
def __reduce_ex__(self, protocol):
if protocol == 5:
return type(self), PickleBuffer(self), None
# Legacy code for earlier protocols omitted
``PickleBuffer`` is a simple wrapper that doesn't have all the memoryview
semantics and functionality, but is specifically recognized by the ``pickle``
module if protocol 5 or higher is enabled. It is an error to try to
serialize a ``PickleBuffer`` with pickle protocol version 4 or earlier.
Only the raw *data* of the ``PickleBuffer`` will be considered by the
``pickle`` module. Any type-specific *metadata* (such as shapes or
datatype) must be returned separately by the type's ``__reduce__``
implementation, as is already the case.
PickleBuffer objects
--------------------
The ``PickleBuffer`` class supports a very simple Python API. Its constructor
takes a single PEP 3118-compatible object [#pep-3118]_. ``PickleBuffer``
objects themselves support the buffer protocol, so consumers can
call ``memoryview(...)`` on them to get additional information
about the underlying buffer (such as the original type, shape, etc.).
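As a minimal sketch of the proposed Python-level behaviour (the names follow
the proposal above; the exact values shown are an illustration, assuming a
writable bytearray is wrapped)::

    >>> from pickle import PickleBuffer
    >>> buf = PickleBuffer(bytearray(b'abc'))
    >>> m = memoryview(buf)
    >>> m.nbytes, m.readonly
    (3, False)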
On the C side, a simple API will be provided to create and inspect
PickleBuffer objects:
``PyObject *PyPickleBuffer_FromObject(PyObject *obj)``
Create a ``PickleBuffer`` object holding a view over the PEP 3118-compatible
*obj*.
``PyPickleBuffer_Check(PyObject *obj)``
Return whether *obj* is a ``PickleBuffer`` instance.
``const Py_buffer *PyPickleBuffer_GetBuffer(PyObject *picklebuf)``
Return a pointer to the internal ``Py_buffer`` owned by the ``PickleBuffer``
instance.
``PickleBuffer`` can wrap any kind of buffer, including non-contiguous
buffers. It's up to consumers to decide how best to handle different kinds
of buffers (for example, some consumers may find it acceptable to make a
contiguous copy of non-contiguous buffers).
Consumer API
============
``pickle.Pickler.__init__`` and ``pickle.dumps`` are augmented with an additional
``buffer_callback`` parameter::
class Pickler:
def __init__(self, file, protocol=None, ..., buffer_callback=None):
"""
If *buffer_callback* is not None, then it is called with a list
of out-of-band buffer views when deemed necessary (this could be
once every buffer, or only after a certain size is reached,
or once at the end, depending on implementation details). The
callback should arrange to store or transmit those buffers without
changing their order.
If *buffer_callback* is None (the default), buffer views are
serialized into *file* as part of the pickle stream.
It is an error if *buffer_callback* is not None and *protocol* is
None or smaller than 5.
"""
def pickle.dumps(obj, protocol=None, *, ..., buffer_callback=None):
"""
See above for *buffer_callback*.
"""
``pickle.Unpickler.__init__`` and ``pickle.loads`` are augmented with an
additional ``buffers`` parameter::
class Unpickler:
def __init__(self, file, *, ..., buffers=None):
"""
If *buffers* is not None, it should be an iterable of buffer-enabled
objects that is consumed each time the pickle stream references
an out-of-band buffer view. Such buffers have been given in order
to the *buffer_callback* of a Pickler object.
If *buffers* is None (the default), then the buffers are taken
from the pickle stream, assuming they are serialized there.
It is an error for *buffers* to be None if the pickle stream
was produced with a non-None *buffer_callback*.
"""
def pickle.loads(data, *, ..., buffers=None):
"""
See above for *buffers*.
"""
Protocol changes
================
Three new opcodes are introduced:
* ``BYTEARRAY`` creates a bytearray from the data following it in the pickle
stream and pushes it on the stack (just like ``BINBYTES8`` does for bytes
objects);
* ``NEXT_BUFFER`` fetches a buffer from the ``buffers`` iterable and pushes
it on the stack.
* ``READONLY_BUFFER`` makes a readonly view of the top of the stack.
When pickling encounters a ``PickleBuffer``, there can be four cases:
* If a ``buffer_callback`` is given and the ``PickleBuffer`` is writable,
the ``PickleBuffer`` is given to the callback and a ``NEXT_BUFFER`` opcode
is appended to the pickle stream.
* If a ``buffer_callback`` is given and the ``PickleBuffer`` is readonly,
the ``PickleBuffer`` is given to the callback and a ``NEXT_BUFFER`` opcode
is appended to the pickle stream, followed by a ``READONLY_BUFFER`` opcode.
* If no ``buffer_callback`` is given and the ``PickleBuffer`` is writable,
it is serialized into the pickle stream as if it were a ``bytearray`` object.
* If no ``buffer_callback`` is given and the ``PickleBuffer`` is readonly,
it is serialized into the pickle stream as if it were a ``bytes`` object.
The distinction between readonly and writable buffers is explained below
(see "Mutability").
Caveats
=======
Mutability
----------
PEP 3118 buffers [#pep-3118]_ can be readonly or writable. Some objects,
such as Numpy arrays, need to be backed by a mutable buffer for full
operation. Pickle consumers that use the ``buffer_callback`` and ``buffers``
arguments will have to be careful to recreate mutable buffers. When doing
I/O, this implies using buffer-passing API variants such as ``readinto``
(which are also often preferable for performance).
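For example, a consumer receiving out-of-band buffers from a stream might
allocate mutable ``bytearray`` objects and fill them with ``readinto`` (a
hypothetical sketch; ``stream`` and ``sizes`` stand for whatever transport
and framing scheme the application uses, and short reads are ignored for
brevity)::

    def recv_buffers(stream, sizes):
        # Receive each out-of-band buffer into a freshly allocated, mutable
        # bytearray, so that unpickled objects can be backed by writable memory.
        buffers = []
        for size in sizes:
            buf = bytearray(size)
            stream.readinto(buf)   # fill in place, without an extra copy
            buffers.append(buf)
        return buffers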
Data sharing
------------
If you pickle and then unpickle an object in the same process, passing
out-of-band buffer views, then the unpickled object may be backed by the
same buffer as the original pickled object.
For example, it might be reasonable to implement reduction of a Numpy array
as follows (crucial metadata such as shapes is omitted for simplicity)::
class ndarray:
def __reduce_ex__(self, protocol):
if protocol == 5:
return numpy.frombuffer, (PickleBuffer(self), self.dtype)
# Legacy code for earlier protocols omitted
Then simply passing the PickleBuffer around from ``dumps`` to ``loads``
will produce a new Numpy array sharing the same underlying memory as the
original Numpy object (and, incidentally, keeping it alive)::
>>> import numpy as np
>>> a = np.zeros(10)
>>> a[0]
0.0
>>> buffers = []
>>> data = pickle.dumps(a, protocol=5, buffer_callback=buffers.extend)
>>> b = pickle.loads(data, buffers=buffers)
>>> b[0] = 42
>>> a[0]
42.0
This won't happen with the traditional ``pickle`` API (i.e. without passing
``buffers`` and ``buffer_callback`` parameters), because then the buffer view
is serialized inside the pickle stream with a copy.
Alternatives
============
The ``pickle`` persistence interface is a way of storing references to
designated objects in the pickle stream while handling their actual
serialization out of band. For example, one might consider the following
for zero-copy serialization of bytearrays::
class MyPickle(pickle.Pickler):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.buffers = []
def persistent_id(self, obj):
if type(obj) is not bytearray:
return None
else:
index = len(self.buffers)
self.buffers.append(obj)
return ('bytearray', index)
class MyUnpickle(pickle.Unpickler):
def __init__(self, *args, buffers, **kwargs):
super().__init__(*args, **kwargs)
self.buffers = buffers
def persistent_load(self, pid):
type_tag, index = pid
if type_tag == 'bytearray':
return self.buffers[index]
else:
assert 0 # unexpected type
This mechanism has two drawbacks:
* Each ``pickle`` consumer must reimplement ``Pickler`` and ``Unpickler``
subclasses, with custom code for each type of interest. Essentially,
N pickle consumers end up each implementing custom code for M producers.
This is difficult (especially for sophisticated types such as Numpy
arrays) and poorly scalable.
* Each object encountered by the pickle module (even simple built-in objects
such as ints and strings) triggers a call to the user's ``persistent_id()``
method, leading to a possible performance drop compared to nominal.
Open questions
==============
Should ``buffer_callback`` take a single buffer or a sequence of buffers?
* Taking a single buffer would allow returning a boolean indicating whether
the given buffer is serialized in-band or out-of-band.
* Taking a sequence of buffers is potentially more efficient by reducing
function call overhead.
Related work
============
Dask.distributed implements a custom zero-copy serialization with fallback
to pickle [#dask-serialization]_.
PyArrow implements zero-copy component-based serialization for a few
selected types [#pyarrow-serialization]_.
PEP 554 proposes hosting multiple interpreters in a single process, with
provisions for transferring buffers between interpreters as a communication
scheme [#pep-554]_.
Acknowledgements
================
Thanks to the following people for early feedback: Nick Coghlan, Olivier
Grisel, Stefan Krah, MinRK, Matt Rocklin, Eric Snow.
References
==========
.. [#dask] Dask.distributed -- A lightweight library for distributed computing
in Python
https://distributed.readthedocs.io/
.. [#dask-serialization] Dask.distributed custom serialization
https://distributed.readthedocs.io/en/latest/serialization.html
.. [#ipyparallel] IPyParallel -- Using IPython for parallel computing
https://ipyparallel.readthedocs.io/
.. [#pyarrow] PyArrow -- A cross-language development platform for in-memory data
https://arrow.apache.org/docs/python/
.. [#pyarrow-serialization] PyArrow IPC and component-based serialization
https://arrow.apache.org/docs/python/ipc.html#component-based-serialization
.. [#pep-3118] PEP 3118 -- Revising the buffer protocol
https://www.python.org/dev/peps/pep-3118/
.. [#pep-554] PEP 554 -- Multiple Interpreters in the Stdlib
https://www.python.org/dev/peps/pep-0554/
Copyright
=========
This document has been placed into the public domain.
Hi,
On Twitter, Raymond Hettinger wrote:
"The decision making process on Python-dev is an anti-pattern,
governed by anecdotal data and ambiguity over what problem is solved."
https://twitter.com/raymondh/status/887069454693158912
About "anecdotal data", I would like to discuss the Python startup time.
== Python 3.7 compared to 2.7 ==
First of all, on speed.python.org, we have:
* Python 2.7: 6.4 ms with site, 3.0 ms without site (-S)
* master (3.7): 14.5 ms with site, 8.4 ms without site (-S)
Python 3.7 startup time is 2.3x slower with site (default mode), or
2.8x slower without site (-S command line option).
(I will skip Python 3.4, 3.5 and 3.6 which are much worse than Python 3.7...)
So if a user complained about Python 2.7 startup time, be prepared
for a user who is 2x - 3x angrier when "forced" to upgrade to Python 3!
== Mercurial vs Git, Python vs C, startup time ==
Startup time matters a lot for Mercurial since Mercurial is compared
to Git. Git and Mercurial have similar features, but Git is written in
C whereas Mercurial is written in Python. Quick benchmark on the
speed.python.org server:
* hg version: 44.6 ms +- 0.2 ms
* git --version: 974 us +- 7 us
Mercurial startup time is already 45.8x slower than Git's, even though the
tested Mercurial runs on Python 2.7.12. Now try to sell Python 3 to Mercurial
developers, with a startup time 2x - 3x slower...
I tested Mercurial 3.7.3 and Git 2.7.4 on Ubuntu 16.04.1 using "python3
-m perf command -- ...".
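(i.e. roughly:

    python3 -m perf command -- hg version
    python3 -m perf command -- git --version

modulo the exact perf options.)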
== CPython core developers don't care? no, they do care ==
Christian Heimes, Naoki INADA, Serhiy Storchaka, Yury Selivanov, me
(Victor Stinner) and other core developers have made multiple changes over
the last few years to reduce the number of imports at startup, optimize
importlib, etc.
IMHO all these core developers are well aware of the competition between
programming languages, and honestly, Python startup time isn't "good".
So let's compare it to other programming languages similar to Python.
== PHP, Ruby, Perl ==
I measured the startup time of other programming languages which are
similar to Python, still on the speed.python.org server using "python3
-m perf command -- ...":
* perl -e ' ': 1.18 ms +- 0.01 ms
* php -r ' ': 8.57 ms +- 0.05 ms
* ruby -e ' ': 32.8 ms +- 0.1 ms
Wow, Perl is quite good! PHP seems as good as Python 2 (but Python 3
is worse). Ruby startup time seems less optimized than other
languages.
Tested versions:
* perl 5, version 22, subversion 1 (v5.22.1)
* PHP 7.0.18-0ubuntu0.16.04.1 (cli) ( NTS )
* ruby 2.3.1p112 (2016-04-26) [x86_64-linux-gnu]
== Quick Google search ==
I also searched for "python startup time" and "python slow startup
time" on Google and found many articles. Some examples:
"Reducing the Python startup time"
http://www.draketo.de/book/export/html/498
=> "The python startup time always nagged me (17-30ms) and I just
searched again for a way to reduce it, when I found this: The
Python-Launcher caches GTK imports and forks new processes to reduce
the startup time of python GUI programs."
https://nelsonslog.wordpress.com/2013/04/08/python-startup-time/
=> "Wow, Python startup time is worse than I thought."
"How to speed up python starting up and/or reduce file search while
loading libraries?"
https://stackoverflow.com/questions/15474160/how-to-speed-up-python-startin…
=> "The first time I log to the system and start one command it takes
6 seconds just to show a few line of help. If I immediately issue the
same command again it takes 0.1s. After a couple of minutes it gets
back to 6s. (proof of short-lived cache)"
"How does one optimise the startup of a Python script/program?"
https://www.quora.com/How-does-one-optimise-the-startup-of-a-Python-script-…
=> "I wrote a Python program that would be used very often (imagine
'cd' or 'ls') for very short runtimes, how would I make it start up as
fast as possible?"
"Python Interpreter Startup time"
https://bytes.com/topic/python/answers/34469-pyhton-interpreter-startup-time
"Python is very slow to start on Windows 7"
https://stackoverflow.com/questions/29997274/python-is-very-slow-to-start-o…
=> "Python takes 17 times longer to load on my Windows 7 machine than
Ubuntu 14.04 running on a VM"
=> "returns in 0.614s on Windows and 0.036s on Linux"
"How to make a fast command line tool in Python" (old article Python 2.5.2)
https://files.bemusement.org/talks/OSDC2008-FastPython/
=> "(...) some techniques Bazaar uses to start quickly, such as lazy imports."
--
So please continue the efforts to make Python startup even faster, to beat
all other programming languages and finally convince Mercurial to
upgrade ;-)
Victor
Hi folks,
As some people here know I've been working off and on for a while to
improve CPython's support of Cygwin. I'm motivated in part by a need
to have software working on Python 3.x on Cygwin for the foreseeable
future, preferably with minimal graft. (As an incidental side-effect
Python's test suite--especially of system-level functionality--serves
as an interesting test suite for Cygwin itself too.)
This is partly what motivated PEP 539 [1], although that PEP had the
advantage of benefiting other POSIX-compatible platforms as well (and
in fact was fixing an aspect of CPython that made it unfriendly to
supporting other platforms).
As far as I can tell, the first commit to Python to add any kind of
support for Cygwin was made by Guido (committing a contributed patch)
back in 1999 [2]. Since then, bits and pieces have been added for
Cygwin's benefit over time, with varying degrees of impact in terms of
#ifdefs and the like (for the most part Cygwin does not require *much*
in the way of special support, but it does have some differences from
a "normal" POSIX-compliant platform, such as the possibility for
case-insensitive filesystems and executables that end in .exe). I
don't know whether it's ever been "officially supported" but someone
with a longer memory of the project can comment on that. I'm not sure
if it was discussed at all or not in the context of PEP 11.
I have personally put in a fair amount of effort already in either
fixing issues on Cygwin (many of these issues also impact MinGW), or
more often than not fixing issues in the CPython test suite on
Cygwin--these are mostly tests that are broken due to invalid
assumptions about the platform (for example, that there is always a
"root" user with uid=0; this is not the case on Cygwin). In other
cases some tests need to be skipped or worked around due to
platform-specific bugs, and Cygwin is hardly the only case of this in
the test suite.
I also have an experimental AppVeyor configuration for running the
tests on Cygwin [3], as well as an experimental buildbot (not
available on the internet, but working). These currently rely on a
custom branch that includes fixes needed for the test suite to run to
completion without crashing or hanging (e.g.
https://bugs.python.org/issue31885). It would be nice to add this as
an official buildbot, but I'm not sure if it makes sense to do that
until it's "green", or at least not crashing. I have several other
patches to the tests toward this goal, and am currently down to ~22
tests failing.
Before I do any more work on this, however, it would be best to once
and for all clarify the support for Cygwin in CPython, as it has never
been "officially supported" nor unsupported--this way we can avoid
having this discussion every time a patch related to Cygwin comes up.
I could provide some arguments for why I believe Cygwin should be
supported, but before this gets too long I'd just like to float the
idea of having the discussion in the first place. It's also not
exactly clear to me how to meet the standards in PEP 11 for supporting
a platform--in particular it's not clear when a buildbot is considered
"stable", or how to achieve that without getting necessary fixes
merged into the main branch in the first place.
Thanks,
Erik
[1] https://www.python.org/dev/peps/pep-0539/
[2] https://github.com/python/cpython/commit/717d1fdf2acbef5e6b47d9b4dcf48ef182…
[3] https://ci.appveyor.com/project/embray/cpython
Hi,
I have been asked to express myself on PEP 572. I'm not sure that
it's useful, but here is my personal opinion on the proposed
"assignment expressions".
PEP 572 -- Assignment Expressions:
https://www.python.org/dev/peps/pep-0572/
First of all, I concur with others: Chris Angelico did a great job
designing a good and complete PEP, and a working implementation which is
also useful for playing with the feature!
WARNING! I was (strongly) opposed to PEP 448 Unpacking Generalizations
(ex: [1, 2, *list]) and PEP 498 f-string (f"Hello {name}"), whereas I
am now a happy user of these new syntaxes. So I'm not sure that I have
good taste :-)
Tim Peters gave the following example. The "LONG" version:
diff = x - x_base
if diff:
g = gcd(diff, n)
if g > 1:
return g
versus the "SHORT" version:
if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
return g
== Write ==
If your job is to write code: the SHORT version can be preferred since
it's closer to what you have in mind and the code is shorter. When you
read your own code, it seems straightforward and you like to see
everything on the same line.
The LONG version looks like your expressiveness is limited by the
computer. It's like having to use simple words when you talk to a
child, because a child is unable to understand more subtle and
advanced sentences. You want to write beautiful code for adults,
right?
== Read and Understand ==
In my professional experience, I spent most of my time on reading
code, rather than writing code. By reading, I mean: try to understand
why this specific bug that cannot occur... is always reproduced by the
customer, whereas we fail to reproduce it in our test lab :-) This bug
is impossible, you know it, right?
So let's say that you never read the example before, and it has a bug.
By "reading the code", I really mean understanding here. In your
opinion, which version is easier to *understand*, without actually
running the code?
IMHO the LONG version is simpler to understand, since the code is
straightforward, it's easy to "guess" the *control flow* (guess in
which order instructions will be executed).
Print the code on paper and try to draw lines to follow the control
flow. It may then be easier to see how SHORT is more complex to
follow than LONG.
== Debug ==
Now let's imagine that you can run the code (someone succeeded in
reproducing the bug in the test lab!). Since it has a bug, you now
likely want to try to understand why the bug occurs using a debugger.
Sadly, most debuggers are designed as if a single line of code can only
execute a single instruction. I tried pdb: you cannot run only (diff
:= x - x_base) and then get the "diff" value before running the second
assignment; you can only execute the *full line* at once.
I would say that the LONG version is easier to debug, at least using pdb.
I regularly use gdb, which implements the "step" command as I
expect (don't execute the full line; execute sub-expressions one by
one), but it's still harder to follow the control flow when a single
line contains multiple instructions, than debugging lines with a
single instruction.
You can see it as a limitation of pdb, but many tools only have the
granularity of a whole line. Think about tracebacks. If you get an
exception at "line 1" in the SHORT example (the long "if" expression),
what can you deduce from the line number? What happened?
If you get an exception in the LONG example, the line number gives you
a little bit more information... maybe just enough to understand the
bug?
Example showing the pdb limitation:
>>> def f():
... breakpoint()
... if (x:=1) and (y:=2): pass
...
>>> f()
> <stdin>(3)f()
(Pdb) p x
*** NameError: name 'x' is not defined
(Pdb) p y
*** NameError: name 'y' is not defined
(Pdb) step
--Return--
> <stdin>(3)f()->None
(Pdb) p x
1
(Pdb) p y
2
... oh, pdb went too far. I expected a break after "x := 1" and before
"y := 2" :-(
== Write code for babies! ==
Please don't write code for yourself, but write code for babies! :-)
These babies are going to maintain your code for the next 5 years,
while you move to a different team or project in the meantime. Be
kind to your coworkers and juniors!
I'm trying to write a single instruction per line whenever possible,
even if the language I use allows much more complex expressions.
Even if the C language allows assignments in if, I avoid them, because
I regularly have to debug my own code in gdb ;-)
Now the question is which Python features are allowed for babies. I recall that
a colleague was surprised and confused by context managers. Does it
mean that try/finally should be preferred? What about f'Hello
{name.title()}' which calls a method inside a "string" (formatting)? Or
metaclasses? I guess that the limit should depend on your team, and
may be explained in the coding style designed by your whole team?
Victor
[Victor Stinner]
...
> Tim Peters gave the following example. The "LONG" version:
>
> diff = x - x_base
> if diff:
> g = gcd(diff, n)
> if g > 1:
> return g
>
> versus the "SHORT" version:
>
> if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
> return g
>
> == Write ==
>
> If your job is to write code: the SHORT version can be preferred since
> it's closer to what you have in mind and the code is shorter. When you
> read your own code, it seems straightforward and you like to see
> everything on the same line.
All so, but a bit more: in context, this is just one block in a
complex algorithm. The amount of _vertical_ screen space it consumes
directly affects how much of what comes before and after it can be
seen without scrolling. Understanding this one block in isolation is
approximately useless unless you can also see how it fits into the
whole. Saving 3 lines of 5 is substantial, but it's more often saving
1 of 5 or 6. Regardless, they add up.
> The LONG version looks like your expressiveness is limited by the
> computer. It's like having to use simple words when you talk to a
> child, because a child is unable to understand more subtle and
> advanced sentences. You want to write beautiful code for adults,
> right?
I want _the whole_ to be as transparent as possible. That's a
complicated balancing act in practice.
> == Read and Understand ==
>
> In my professional experience, I spent most of my time on reading
> code, rather than writing code. By reading, I mean: try to understand
> why this specific bug that cannot occur... is always reproduced by the
> customer, whereas we fail to reproduce it in our test lab :-) This bug
> is impossible, you know it, right?
>
> So let's say that you never read the example before, and it has a bug.
Then you're screwed - pay me to fix it ;-) Seriously, as above, this
block on its own is senseless without understanding both the
mathematics behind what it's doing, and on how all the code before it
picked `x` and `x_base` to begin with.
> By "reading the code", I really mean understanding here. In your
> opinion, which version is easier to *understand*, without actually
> running the code?
Honestly, I find the shorter version a bit easier to understand:
fewer indentation levels, and less semantically empty repetition of
names.
> IMHO the LONG version is simpler to understand, since the code is
> straightforward, it's easy to "guess" the *control flow* (guess in
> which order instructions will be executed).
You're saying you don't know that in "x and y" Python evaluates x
first, and only evaluates y if x "is truthy"? Sorry, but this seems
trivial to me in either spelling.
> Print the code on paper and try to draw lines to follow the control
> flow. It may then be easier to see how SHORT is more complex to
> follow than LONG.
Since they're semantically identical, there's _something_ suspect
about a conclusion that one is _necessarily_ harder to understand than
the other ;-) I don't have a problem with you finding the longer
version easier to understand, but I do have a problem if you have a
problem with me finding the shorter easier.
> == Debug ==
>
> Now let's imagine that you can run the code (someone succeeded in
> reproducing the bug in the test lab!). Since it has a bug, you now
> likely want to try to understand why the bug occurs using a debugger.
>
> Sadly, most debuggers are designed as if a single line of code can only
> execute a single instruction. I tried pdb: you cannot run only (diff
> := x - x_base) and then get the "diff" value before running the second
> assignment; you can only execute the *full line* at once.
>
> I would say that the LONG version is easier to debug, at least using pdb.
That might be a good reason to avoid, say, list comprehensions (highly
complex expressions of just about any kind), but I think this
overlooks the primary _point_ of "binding expressions": to give names
to intermediate results. I couldn't care less if pdb executes the
whole "if" statement in one gulp, because I get exactly the same info
either way: the names `diff` and `g` bound to the results of the
expressions they named. What actual difference does it make whether
pdb binds the names one at a time, or both, before it returns to the
prompt?
Binding expressions are debugger-friendly in that they _don't_ just
vanish without a trace. It's their purpose to _capture_ the values of
the expressions they name. Indeed, you may want to add them all over
the place inside expressions, never intending to use the names, just
so that you can see otherwise-ephemeral intra-expression results in
your debugger ;-)
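For instance (a minimal sketch reusing the block from this thread), with
binding expressions the intermediate results keep names a debugger can
inspect, even though everything happens on one line:

    from math import gcd

    def check(x, x_base, n):
        if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
            return g
        # A debugger stopped after the "if" can still print diff (and g,
        # whenever the second clause ran), with no separate assignment
        # statements in sight.
        return None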
> ... Think about tracebacks. If you get an exception at "line 1" in the
> SHORT example (the long "if" expression), what can you deduce
> from the line number? What happened?
>
> If you get an exception in the LONG example, the line number gives you
> a little bit more information... maybe just enough to understand the
> bug?
This one I wholly agree with, in general. In the specific example at
hand, it's weak, because there's so little that _could_ raise an
exception. For example, if the variables weren't bound to integers,
in context the code would have blown up long before reaching this
block. Python ints are unbounded, so overflow in "-" or "gcd" aren't
possible either. MemoryError is theoretically possible, and in that
case it would be good to know whether it happened during "-" or during
"gcd()". Good to know, but not really helpful, because either way you
ran out of memory :-(
> == Write code for babies! ==
>
> Please don't write code for yourself, but write code for babies! :-)
>
> These babies are going to maintain your code for the next 5 years,
> while you move to a different team or project in the meantime. Be
> kind to your coworkers and juniors!
>
> I'm trying to write a single instruction per line whenever possible,
> even if the language I use allows much more complex expressions.
> Even if the C language allows assignments in if, I avoid them, because
> I regularly have to debug my own code in gdb ;-)
>
> Now the question is which Python features are allowed for babies. I recall that
> a colleague was surprised and confused by context managers. Does it
> mean that try/finally should be preferred? What about f'Hello
> {name.title()}' which calls a method inside a "string" (formatting)? Or
> metaclasses? I guess that the limit should depend on your team, and
> may be explained in the coding style designed by your whole team?
It's the kind of thing I prefer to leave to team style guides, because
consensus will never be reached. In a different recent thread,
someone complained about using functions at all, because their names
are never wholly accurate, and in any case they hide what's "really"
going on. To my eyes, that was an unreasonably extreme "write code
for babies" position.
If a style guide banned using "and" or "or" in Python "if" or "while"
tests, I'd find that less extreme, but also unreasonable.
But if a style guide banned functions with more than 50 formal
arguments, I'd find that unreasonably tolerant.
Luckily, I only have to write code for me now, so am free to pick the
perfect compromise in every case ;-)
In pondering our approach to future Python major releases, I found
myself considering the experience we've had with Python 3. The whole
Py3k effort predates my involvement in the community so I missed a
bunch of context about the motivations, decisions, and challenges.
While I've pieced some of that together over the years now since I've
been around, I've certainly seen much of the aftermath. For me, at
least, it would be helpful to have a bit more insight into the
history. :)
With that in mind, it would be worth having an informational PEP with
an authoritative retrospective on the lessons learned from the Python
3 effort (and transition). Consider it a sort of autobiography,
"memoirs on the python-dev change to Python 3". :) At this point the
transition has settled in enough that we should be able to present a
relatively objective (and consistent) view, while we're not so far
removed that we've forgotten anything important. :) If such a
document already exists then I'd love a pointer to it.
The document would benefit (among others):
* python-dev (by giving us a clear viewpoint to inform decisions about
future releases)
* new-comers to Python that want more insight into the language
* folks transitioning from 2 to 3
* communities that have (or think they have) problems similar to those
we faced in Python 2
The PEP doesn't even have to be done all at once, nor by one person.
In fact, there are many viewpoints that would add value to the
document. Hence it would probably make sense to encourage broad
participation and then have a single editor to effect a single voice
in the document.
The contents of the retrospective document should probably cover a
broad range of topics, since there's so much to learn from the move to
Python 3. To give an indication of what I mean, I've included a rough
outline at the bottom of this message.
So...I typically strongly avoid making proposals that I'm not willing
to execute. However, in this case I simply do not have enough
experience in the history to feel comfortable doing a good job of it
in a reasonable amount of time (which matters due to the tendency of
valuable info to fade away). :/ I have no expectation that someone
will pick this up, though I do hope since the benefit would be
significant. My apologies in advance if this wasted anyone's time.
-eric
++++++++++++++++++++++++++++++++
I'd hope to see something along the lines of (at least) the following,
in rough order:
* a concise summary of the document at the top (very meta, I know :) )
+ what were we solving?
+ what was the solution?
+ why do it that way?
+ what went right?
+ what went wrong?
+ impact on the community
+ impact on core dev contribution
* timeline
* key players (and level of involvement)
+ old guard core devs
+ new guard
+ folks brought on for Py3k (e.g. IIRC a swarm of Googlers dove in)
+ non-core-devs
* motivations
* expectations (e.g. time frames, community reaction)
* corresponding results
* a summary of what we did
* alternative approaches
* what went right (and was it on purpose :) )
* what went wrong (e.g. io) and why
* how the Py3k project differed from normal python-dev workflow (e.g.
pace, decision-making, communications)
* lasting impact on python-dev
* key things that would have been better if done differently
* key decisions/planning (mostly a priori to the release work)
+ scope of backward compatibility
+ process (using PEPs with PEPs 30xx guiding)
+ schedule
+ specific changes (i.e. PEPs 31xx)
+ what was left out (and why)
+ plans to help library and app authors transition (e.g. 2to3)
+ feature/schedule overlap with Python 2 (i.e. 2.6 and 2.7)
+ the language moratorium
* things that got missed and why
+ unicode/bytes in some stdlib modules (and builtins?)
* things that were overdone (and how that got missed)
+ unicode/bytes in some stdlib modules (and builtins?)
* (last but not least) challenges faced by folks working to transition
their existing code to Python 3
Hi All,
We’re planning to finish up the bugs.python.org migration to Red Hat
OpenShift by May 14th (US PyCon Sprints). For the most part,
everything will stay the same, with the exception of cleaning up some old
URLs and redirects from the previous hosting provider: Upfront
Software.
We will post a more concrete timeline here by May 1st, but wanted to
share this exciting news to move bugs.python.org into a more stable
and optimal state.
Thank you all for your patience and feedback. A special thanks to
Maciej Szulik and Red Hat for helping the PSF with this project.
Best regards,
Mark
--
Mark Mangoba | PSF IT Manager | Python Software Foundation |
mmangoba(a)python.org | python.org | Infrastructure Staff:
infrastructure-staff(a)python.org | GPG: 2DE4 D92B 739C 649B EBB8 CCF6
DC05 E024 5F4C A0D1
PEP 572 caused a strong emotional reaction in me. I wanted to first understand
my intuitive objection to the idea before posting anything.
I feel that (name := expression) doesn't fit the narrative of PEP 20. It
doesn't remove complexity, it only moves it. What was previously its own
assignment now becomes part of the logic test. This saves on vertical whitespace but makes
parsing and understanding logic tests harder. This is a bad bargain: logic
tests already contain a lot of complexity that human readers have to cope with.
Proponents of := argue it makes several patterns flatter (= better than nested)
to express. Serial regular expression matching is a popular example. However,
(name := expression) itself is making logic tests more nested, not flatter. It
makes information in the logic test denser (= worse than sparse). Since it also
requires an additional pair of parentheses, it forces the reader to decompose
the expression in their head.
:= also goes against having one obvious way to do it. Since it's an expression,
it can also be placed on its own line or in otherwise weird places like
function call arguments. I anticipate PEP 8 would have to be extended to
explicitly discourage such abuse. Linters would grow rules against it. This is
noise.
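Two hypothetical illustrations of such placements, for concreteness:

    import os

    (pid := os.getpid())             # expression statement instead of "pid = os.getpid()"
    print(names := os.listdir('.'))  # a binding tucked inside a call argument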
I'm -1 on PEP 572, I think it's very similar in spirit to the rejected PEP 463.
-- Ł
On 2018-04-13 21:30, Raymond Hettinger wrote:
> It would be nice to have a section that specifically discusses the implications with respect to other existing function-like tooling: classmethod, staticmethod, partial, itemgetter, attrgetter, methodgetter, etc.
My hope is that there are no such implications. An important design goal
of this PEP (which I believe I achieved) is that as long as you're doing
duck typing, you should be safe. I believe that the tools in your list
do exactly that.
It's only when you use inspect or when you do type checks that you will
see the difference with this PEP.
After implementing the C code part of my PEP, there were only a
relatively small number of test failures. You can look at this commit,
which contains all the Python code changes of my implementation; it doesn't
look so bad:
https://github.com/jdemeyer/cpython/commit/c404a8f1b7d9525dd2842712fe183a05…
> For example, I would need to update the code in random._randbelow().
For the record, there are no test failures related to this, but maybe
that's just because tests for this are missing.