The current memory layout for dictionaries is
unnecessarily inefficient. It has a sparse table of
24-byte entries containing the hash value, key pointer,
and value pointer.
Instead, the 24-byte entries should be stored in a
dense table referenced by a sparse table of indices.
For example, the dictionary:
d = {'timmy': 'red', 'barry': 'green', 'guido': 'blue'}
is currently stored as:
entries = [['--', '--', '--'],
           [-8522787127447073495, 'barry', 'green'],
           ['--', '--', '--'],
           ['--', '--', '--'],
           ['--', '--', '--'],
           [-9092791511155847987, 'timmy', 'red'],
           ['--', '--', '--'],
           [-6480567542315338377, 'guido', 'blue']]
Instead, the data should be organized as follows:
indices = [None, 1, None, None, None, 0, None, 2]
entries = [[-9092791511155847987, 'timmy', 'red'],
           [-8522787127447073495, 'barry', 'green'],
           [-6480567542315338377, 'guido', 'blue']]
Only the data layout needs to change. The hash table
algorithms would stay the same. All of the current
optimizations would be kept, including key-sharing
dicts and custom lookup functions for string-only
dicts. There is no change to the hash functions, the
table search order, or collision statistics.
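To make the indirection concrete, here is a rough pure-Python
sketch of a lookup (illustrative only; real CPython dicts use open
addressing with perturbation, and the full proof-of-concept is
linked at the end of this post):

def lookup(indices, entries, key):
    # The sparse indices table is probed just like today's table; it
    # merely stores small integers pointing into the dense entries list.
    mask = len(indices) - 1
    h = hash(key)
    i = h & mask
    while True:
        ix = indices[i]
        if ix is None:                 # empty slot: key is absent
            raise KeyError(key)
        entry_hash, entry_key, value = entries[ix]
        if entry_key is key or (entry_hash == h and entry_key == key):
            return value
        i = (i + 1) & mask             # simplified linear probing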
The memory savings are significant (from 30% to 95%
compression, depending on how full the table is).
Small dicts (size 0, 1, or 2) get the most benefit.
For a sparse table of size t with n entries, the sizes are:
curr_size = 24 * t
new_size = 24 * n + sizeof(index) * t
In the above timmy/barry/guido example, the current
size is 192 bytes (eight 24-byte entries) and the new
size is 80 bytes (three 24-byte entries plus eight
1-byte indices). That gives 58% compression.
Note that sizeof(index) can be as small as a single
byte for small dicts, two bytes for bigger dicts, and
up to sizeof(Py_ssize_t) for huge dicts.
In addition to space savings, the new memory layout
makes iteration faster. Currently, keys(), values(), and
items() loop over the sparse table, skipping over free
slots in the hash table. Now, keys/values/items can
loop directly over the dense table, using fewer memory
accesses.
Another benefit is that resizing is faster and
touches fewer pieces of memory. Currently, every
hash/key/value entry is moved or copied during a
resize. In the new layout, only the indices are
updated. For the most part, the hash/key/value entries
never move (except for an occasional swap to fill a
hole left by a deletion).
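For illustration, a resize in the new layout amounts to rebuilding
just the index table from the dense entry list; a rough sketch
(same simplified probing as above):

def rebuild_indices(entries, new_size):
    # new_size is assumed to be a power of two.  The dense entry table
    # is left untouched; only the small index table is recomputed.
    indices = [None] * new_size
    mask = new_size - 1
    for n, (entry_hash, key, value) in enumerate(entries):
        i = entry_hash & mask
        while indices[i] is not None:  # find the next free index slot
            i = (i + 1) & mask         # simplified linear probing
        indices[i] = n
    return indices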
With the reduced memory footprint, we can also expect
better cache utilization.
For those wanting to experiment with the design,
there is a pure Python proof-of-concept here:
http://code.activestate.com/recipes/578375
YMMV: Keep in mind that the above size statistics assume a
build with 64-bit Py_ssize_t and 64-bit pointers. The
space savings percentages are a bit different on other
builds. Also, note that in many applications, the size
of the data dominates the size of the container (i.e.
the weight of a bucket of water is mostly the water,
not the bucket).
Raymond
I've received some enthusiastic emails from someone who wants to
revive restricted mode. He started out with a bunch of patches to the
CPython runtime using ctypes, which he attached to an App Engine bug:
http://code.google.com/p/googleappengine/issues/detail?id=671
Based on his code (the file secure.py is all you need, included in
secure.tar.gz) it seems he believes the only security leaks are
__subclasses__, gi_frame and gi_code. (I have since convinced him that
if we add "restricted" guards to these attributes, he doesn't need the
functions added to sys.)
I don't recall the exploits that Samuele once posted that caused the
death of rexec.py -- does anyone recall, or have a pointer to the
threads?
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
It appears there's no obvious link from bugs.python.org to the contributor
agreement - you need to go via the unintuitive link Foundation ->
Contribution Forms (and from what I've read, you're prompted when you add a
patch to the tracker).
I'd suggest that if the "Contributor Form Received" field is "No" in user
details, there be a link to http://www.python.org/psf/contrib/.
Tim Delaney
Hello.
I would like to discuss at the language summit a potential inclusion
of cffi[1] into the stdlib. This is a project Armin Rigo has been
working on for a while, with some input from other developers. It seems
that the main reason people would prefer ctypes over cffi these days is
"because it's included in the stdlib", which is not generally the reason I
would like to hear. Our calls to not use C extensions and to use an
FFI instead have seen very limited success with ctypes and quite a lot
more since cffi got released. The API is fairly stable right now, with
minor changes going in, and it will definitely stabilize before the Python 3.4
release. Notable projects using it:
* pypycore - gevent main loop ported to cffi
* pgsql2cffi
* sdl-cffi bindings
* tls-cffi bindings
* lxml-cffi port
* pyzmq
* cairo-cffi
* a bunch of others
So, relatively a lot, given that the project is not even a year old (it
got its 0.1 release in June). As per the documentation, the advantages over
ctypes:
* The goal is to call C code from Python. You should be able to do so
without learning a 3rd language: every alternative requires you to
learn their own language (Cython, SWIG) or API (ctypes). So we tried
to assume that you know Python and C and minimize the extra bits of
API that you need to learn.
* Keep all the Python-related logic in Python so that you don’t need
to write much C code (unlike CPython native C extensions).
* Work either at the level of the ABI (Application Binary Interface)
or the API (Application Programming Interface). Usually, C libraries
have a specified C API but often not an ABI (e.g. they may document a
“struct” as having at least these fields, but maybe more). (ctypes
works at the ABI level, whereas Cython and native C extensions work at
the API level.)
* We try to be complete. For now some C99 constructs are not
supported, but all C89 should be, including macros (and including
macro “abuses”, which you can manually wrap in saner-looking C
functions).
* We attempt to support both PyPy and CPython, with a reasonable path
for other Python implementations like IronPython and Jython.
* Note that this project is not about embedding executable C code in
Python, unlike Weave. This is about calling existing C libraries from
Python.
So, among other things, making a cffi extension gives you the same
level of security as writing C (unlike ctypes), and brings quite a
bit more flexibility (the API vs ABI issue) that lets you wrap arbitrary
libraries, even those full of macros.
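For a flavour of the ABI-level mode, here is a minimal sketch along
the lines of the cffi documentation (it loads the standard C library
via dlopen(None), so it assumes a Unix-like system):

from cffi import FFI

ffi = FFI()
# Declare the C function we want to call, copied from its man page.
ffi.cdef("int printf(const char *format, ...);")
# Load the standard C library at the ABI level (no compiler needed).
C = ffi.dlopen(None)
arg = ffi.new("char[]", b"world")   # char* owned by ffi
C.printf(b"hello, %s\n", arg)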
Cheers,
fijal
.. [1]: http://cffi.readthedocs.org/en/release-0.5/
hi, everyone:
I want to compile Python 3.3 with bz2 support on RedHat 5.5 but failed to do so. Here is what I did:
1. download bzip2 and compile it (make, make -f Makefile_libbz2_so, make install)
2. change to the Python 3.3 source directory: ./configure --with-bz2=/usr/local/include
3. make
4. make install
After the installation completed, I tested it:
[root@localhost Python-3.3.0]# python3 -c "import bz2"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/lib/python3.3/bz2.py", line 21, in <module>
from _bz2 import BZ2Compressor, BZ2Decompressor
ImportError: No module named '_bz2'
By the way, RedHat 5.5 has a built-in python 2.4.3. Would it be a problem?
Hi all,
I've been looking for a Github mirror for Python, and found two:
* https://github.com/python-git/python has a lot of forks/watches/stars
but seems to be very out of date (last updated 4 years ago)
* https://github.com/python-mirror/python doesn't appear to be very popular
but is updated daily
Are some of you the owners of these repositories? Should we consolidate to
a single "semi-official" mirror?
Eli
Hi,
This PEP proposes to add a __locallookup__ slot to type objects,
which is used by _PyType_Lookup and super_getattro instead of peeking
in the tp_dict of classes. The PEP text explains why this is needed.
Differences with the previous version:
* Better explanation of why this is a useful addition
* type.__locallookup__ is no longer optional.
* I've added benchmarking results using pybench.
(using the patch attached to issue 18181)
Ronald
PEP: 447
Title: Add __locallookup__ method to metaclass
Version: $Revision$
Last-Modified: $Date$
Author: Ronald Oussoren <ronaldoussoren(a)mac.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 12-Jun-2013
Post-History: 2-Jul-2013, 15-Jul-2013, 29-Jul-2013
Abstract
========
Currently ``object.__getattribute__`` and ``super.__getattribute__`` peek
in the ``__dict__`` of classes on the MRO for a class when looking for
an attribute. This PEP adds an optional ``__locallookup__`` method to
a metaclass that can be used to override this behavior.
Rationale
=========
It is currently not possible to influence how the `super class`_ looks
up attributes (that is, ``super.__getattribute__`` unconditionally
peeks in the class ``__dict__``), and that can be problematic for
dynamic classes that can grow new methods on demand.
The ``__locallookup__`` method makes it possible to dynamically add
attributes even when looking them up using the `super class`_.
The new method affects ``object.__getattribute__`` (and
`PyObject_GenericGetAttr`_) as well for consistency.
Background
----------
The current behavior of ``super.__getattribute__`` causes problems for
classes that are dynamic proxies for other (non-Python) classes or types,
an example of which is `PyObjC`_. PyObjC creates a Python class for every
class in the Objective-C runtime, and looks up methods in the Objective-C
runtime when they are used. This works fine for normal access, but doesn't
work for access with ``super`` objects. Because of this PyObjC currently
includes a custom ``super`` that must be used with its classes.
The API in this PEP makes it possible to remove the custom ``super`` and
simplifies the implementation because the custom lookup behavior can be
added in a central location.
The superclass attribute lookup hook
====================================
Both ``super.__getattribute__`` and ``object.__getattribute__`` (or
`PyObject_GenericGetAttr`_ in C code) walk an object's MRO and peek in the
class' ``__dict__`` to look up attributes. A way to affect this lookup is
using a method on the meta class for the type, that by default looks up
the name in the class ``__dict__``.
In Python code
--------------
A meta type can define a method ``__locallookup__`` that is called during
attribute resolution by both ``super.__getattribute__`` and
``object.__getattribute__``::

    class MetaType(type):
        def __locallookup__(cls, name):
            try:
                return cls.__dict__[name]
            except KeyError:
                raise AttributeError(name) from None
The ``__locallookup__`` method has as its arguments a class and the name of the attribute
that is looked up. It should return the value of the attribute without invoking descriptors,
or raise `AttributeError`_ when the name cannot be found.
The `type`_ class provides a default implementation for ``__locallookup__``, that
looks up the name in the class dictionary.
Example usage
.............
The code below implements a silly metaclass that redirects attribute lookup to uppercase
versions of names::

    class UpperCaseAccess (type):
        def __locallookup__(cls, name):
            return cls.__dict__[name.upper()]

    class SillyObject (metaclass=UpperCaseAccess):
        def m(self):
            return 42

        def M(self):
            return "fortytwo"

    obj = SillyObject()
    assert obj.m() == "fortytwo"
In C code
---------
A new slot ``tp_locallookup`` is added to the ``PyTypeObject`` struct; this slot
corresponds to the ``__locallookup__`` method on `type`_.
The slot has the following prototype::

    PyObject* (*locallookupfunc)(PyTypeObject* cls, PyObject* name);
This method should look up *name* in the namespace of *cls*, without looking at superclasses,
and should not invoke descriptors. The method returns ``NULL`` without setting an exception
when the *name* cannot be found, and returns a new reference otherwise (not a borrowed reference).
Use of this hook by the interpreter
-----------------------------------
The new method is required for metatypes and as such is defined on `type`_. Both
``super.__getattribute__`` and ``object.__getattribute__``/`PyObject_GenericGetAttr`_
(through ``_PyType_Lookup``) use this ``__locallookup__`` method when walking
the MRO.
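As a rough illustration, a pure-Python approximation of the changed lookup
(ignoring the method cache, descriptors and error handling; it assumes the
metaclass provides ``__locallookup__`` as described above)::

    def _pytype_lookup(cls, name):
        # Walk the MRO and ask each class (via its metaclass) for the
        # attribute instead of peeking into the class __dict__ directly.
        for klass in cls.__mro__:
            try:
                return type(klass).__locallookup__(klass, name)
            except AttributeError:
                continue
        return None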
Other changes to the implementation
-----------------------------------
The change for `PyObject_GenericGetAttr`_ will be done by changing the private function
``_PyType_Lookup``. This currently returns a borrowed reference, but must return a new
reference when the ``__locallookup__`` method is present. Because of this ``_PyType_Lookup``
will be renamed to ``_PyType_LookupName``, this will cause compile-time errors for all out-of-tree
users of this private API.
The attribute lookup cache in ``Objects/typeobject.c`` is disabled for classes that have a
metaclass that overrides ``__locallookup__``, because using the cache might not be valid
for such classes.
Performance impact
------------------
The pybench output below compares an implementation of this PEP with the regular
source tree, both based on changeset a5681f50bae2, run on an idle machine with a
Core i7 processor running CentOS 6.4.
Even though the machine was idle there were clear differences between runs;
I've seen differences in "minimum time" vary from -0.1% to +1.5%, with similar
(but slightly smaller) differences in the "average time" figures.
.. ::
-------------------------------------------------------------------------------
PYBENCH 2.1
-------------------------------------------------------------------------------
* using CPython 3.4.0a0 (default, Jul 29 2013, 13:01:34) [GCC 4.4.7 20120313 (Red Hat 4.4.7-3)]
* disabled garbage collection
* system check interval set to maximum: 2147483647
* using timer: time.perf_counter
* timer: resolution=1e-09, implementation=clock_gettime(CLOCK_MONOTONIC)
-------------------------------------------------------------------------------
Benchmark: pep447.pybench
-------------------------------------------------------------------------------
Rounds: 10
Warp: 10
Timer: time.perf_counter
Machine Details:
Platform ID: Linux-2.6.32-358.114.1.openstack.el6.x86_64-x86_64-with-centos-6.4-Final
Processor: x86_64
Python:
Implementation: CPython
Executable: /tmp/default-pep447/bin/python3
Version: 3.4.0a0
Compiler: GCC 4.4.7 20120313 (Red Hat 4.4.7-3)
Bits: 64bit
Build: Jul 29 2013 14:09:12 (#default)
Unicode: UCS4
-------------------------------------------------------------------------------
Comparing with: default.pybench
-------------------------------------------------------------------------------
Rounds: 10
Warp: 10
Timer: time.perf_counter
Machine Details:
Platform ID: Linux-2.6.32-358.114.1.openstack.el6.x86_64-x86_64-with-centos-6.4-Final
Processor: x86_64
Python:
Implementation: CPython
Executable: /tmp/default/bin/python3
Version: 3.4.0a0
Compiler: GCC 4.4.7 20120313 (Red Hat 4.4.7-3)
Bits: 64bit
Build: Jul 29 2013 13:01:34 (#default)
Unicode: UCS4
Test minimum run-time average run-time
this other diff this other diff
-------------------------------------------------------------------------------
BuiltinFunctionCalls: 45ms 44ms +1.3% 45ms 44ms +1.3%
BuiltinMethodLookup: 26ms 27ms -2.4% 27ms 27ms -2.2%
CompareFloats: 33ms 34ms -0.7% 33ms 34ms -1.1%
CompareFloatsIntegers: 66ms 67ms -0.9% 66ms 67ms -0.8%
CompareIntegers: 51ms 50ms +0.9% 51ms 50ms +0.8%
CompareInternedStrings: 34ms 33ms +0.4% 34ms 34ms -0.4%
CompareLongs: 29ms 29ms -0.1% 29ms 29ms -0.0%
CompareStrings: 43ms 44ms -1.8% 44ms 44ms -1.8%
ComplexPythonFunctionCalls: 44ms 42ms +3.9% 44ms 42ms +4.1%
ConcatStrings: 33ms 33ms -0.4% 33ms 33ms -1.0%
CreateInstances: 47ms 48ms -2.9% 47ms 49ms -3.4%
CreateNewInstances: 35ms 36ms -2.5% 36ms 36ms -2.5%
CreateStringsWithConcat: 69ms 70ms -0.7% 69ms 70ms -0.9%
DictCreation: 52ms 50ms +3.1% 52ms 50ms +3.0%
DictWithFloatKeys: 40ms 44ms -10.1% 43ms 45ms -5.8%
DictWithIntegerKeys: 32ms 36ms -11.2% 35ms 37ms -4.6%
DictWithStringKeys: 29ms 34ms -15.7% 35ms 40ms -11.0%
ForLoops: 30ms 29ms +2.2% 30ms 29ms +2.2%
IfThenElse: 38ms 41ms -6.7% 38ms 41ms -6.9%
ListSlicing: 36ms 36ms -0.7% 36ms 37ms -1.3%
NestedForLoops: 43ms 45ms -3.1% 43ms 45ms -3.2%
NestedListComprehensions: 39ms 40ms -1.7% 39ms 40ms -2.1%
NormalClassAttribute: 86ms 82ms +5.1% 86ms 82ms +5.0%
NormalInstanceAttribute: 42ms 42ms +0.3% 42ms 42ms +0.0%
PythonFunctionCalls: 39ms 38ms +3.5% 39ms 38ms +2.8%
PythonMethodCalls: 51ms 49ms +3.0% 51ms 50ms +2.8%
Recursion: 67ms 68ms -1.4% 67ms 68ms -1.4%
SecondImport: 41ms 36ms +12.5% 41ms 36ms +12.6%
SecondPackageImport: 45ms 40ms +13.1% 45ms 40ms +13.2%
SecondSubmoduleImport: 92ms 95ms -2.4% 95ms 98ms -3.6%
SimpleComplexArithmetic: 28ms 28ms -0.1% 28ms 28ms -0.2%
SimpleDictManipulation: 57ms 57ms -1.0% 57ms 58ms -1.0%
SimpleFloatArithmetic: 29ms 28ms +4.7% 29ms 28ms +4.9%
SimpleIntFloatArithmetic: 37ms 41ms -8.5% 37ms 41ms -8.7%
SimpleIntegerArithmetic: 37ms 41ms -9.4% 37ms 42ms -10.2%
SimpleListComprehensions: 33ms 33ms -1.9% 33ms 34ms -2.9%
SimpleListManipulation: 28ms 30ms -4.3% 29ms 30ms -4.1%
SimpleLongArithmetic: 26ms 26ms +0.5% 26ms 26ms +0.5%
SmallLists: 40ms 40ms +0.1% 40ms 40ms +0.1%
SmallTuples: 46ms 47ms -2.4% 46ms 48ms -3.0%
SpecialClassAttribute: 126ms 120ms +4.7% 126ms 121ms +4.4%
SpecialInstanceAttribute: 42ms 42ms +0.6% 42ms 42ms +0.8%
StringMappings: 94ms 91ms +3.9% 94ms 91ms +3.8%
StringPredicates: 48ms 49ms -1.7% 48ms 49ms -2.1%
StringSlicing: 45ms 45ms +1.4% 46ms 45ms +1.5%
TryExcept: 23ms 22ms +4.9% 23ms 22ms +4.8%
TryFinally: 32ms 32ms -0.1% 32ms 32ms +0.1%
TryRaiseExcept: 17ms 17ms +0.9% 17ms 17ms +0.5%
TupleSlicing: 49ms 48ms +1.1% 49ms 49ms +1.0%
WithFinally: 48ms 47ms +2.3% 48ms 47ms +2.4%
WithRaiseExcept: 45ms 44ms +0.8% 45ms 45ms +0.5%
-------------------------------------------------------------------------------
Totals: 2284ms 2287ms -0.1% 2306ms 2308ms -0.1%
(this=pep447.pybench, other=default.pybench)
Alternative proposals
---------------------
``__getattribute_super__``
..........................
An earlier version of this PEP used the following static method on classes::

    def __getattribute_super__(cls, name, object, owner): pass
This method performed name lookup as well as invoking descriptors and was necessarily
limited to working only with ``super.__getattribute__``.
Reuse ``tp_getattro``
.....................
It would be nice to avoid adding a new slot, thus keeping the API simpler and
easier to understand. A comment on `Issue 18181`_ asked about reusing the
``tp_getattro`` slot; that is, super could call the ``tp_getattro`` slot of all
classes along the MRO.
That won't work because ``tp_getattro`` will look in the instance
``__dict__`` before it tries to resolve attributes using classes in the MRO.
This would mean that using ``tp_getattro`` instead of peeking the class
dictionaries changes the semantics of the `super class`_.
References
==========
* `Issue 18181`_ contains a prototype implementation
Copyright
=========
This document has been placed in the public domain.
.. _`Issue 18181`: http://bugs.python.org/issue18181
.. _`super class`: http://docs.python.org/3/library/functions.html#super
.. _`NotImplemented`: http://docs.python.org/3/library/constants.html#NotImplemented
.. _`PyObject_GenericGetAttr`: http://docs.python.org/3/c-api/object.html#PyObject_GenericGetAttr
.. _`type`: http://docs.python.org/3/library/functions.html#type
.. _`AttributeError`: http://docs.python.org/3/library/exceptions.html#AttributeError
.. _`PyObjC`: http://pyobjc.sourceforge.net/
.. _`classmethod`: http://docs.python.org/3/library/functions.html#classmethod
Dear Python developers community,
Recently I found out that it is not possible to debug Python code if it
is part of a zip module.
The Python version being used is 3.3.0.
Well known GUI debuggers like Eclipse+PyDev or PyCharm are unable to
start debugging and give the following warning:
---
pydev debugger: CRITICAL WARNING: This version of python seems to be
incorrectly compiled (internal generated filenames are not absolute)
pydev debugger: The debugger may still function, but it will work
slower and may miss breakpoints.
---
So I started my own investigation of this issue, and the results are the following.
At first I took the traditional Python debugger 'pdb' to analyze how it
behaves while debugging a zipped module.
'pdb' showed me some backtraces, and the filename part of the stack entries
looks malformed.
I expected something like
'full-path-to-zip-dir/my_zipped_module.zip/subdir/test_module.py'
but really it looks like 'full-path-to-current-dir/subdir/test_module.py'.
The source code of pdb.py and bdb.py (which are part of the Python
stdlib) gave me the answer as to why it happens.
The root cause is inside Bdb.format_stack_entry() + Bdb.canonic().
Please take a look at the following line inside 'format_stack_entry' method:
filename = self.canonic(frame.f_code.co_filename)
For a zipped module the variable 'frame.f_code.co_filename' holds a _relative_
file path starting from the root of the zip archive, like
'subdir/test_module.py'.
And as a result the Bdb.canonic() method gives what we have -
'full-path-to-current-dir/subdir/test_module.py'
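A minimal sketch of what Bdb.canonic() effectively does with such a relative
path (simplified; the real method also caches results and special-cases
'<...>' filenames):

import os

def canonic(filename):
    # A relative co_filename is resolved against the current working
    # directory, so the zip archive part of the path is lost.
    return os.path.normcase(os.path.abspath(filename))

print(canonic("subdir/test_module.py"))
# e.g. /full-path-to-current-dir/subdir/test_module.py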
---
So my final question.
Could anyone confirm that it is a bug in the Python core subsystem
responsible for loading zipped modules,
or that something is wrong with my zipped module?
If it is a bug, could you please submit it to the official Python bug tracker?
I have never done it before and am afraid of doing something wrong.
Thanks,
Vitaly Murashev
Hi,
I suspect that this will be put into a proper PEP at some point, but I'd
like to bring this up for discussion first. This came out of issues 13429
and 16392.
http://bugs.python.org/issue13429
http://bugs.python.org/issue16392
Stefan
The problem
===========
Python modules and extension modules are not being set up in the same way.
For Python modules, the module is created and set up first, then the module
code is being executed. For extensions, i.e. shared libraries, the module
init function is executed straight away and does both the creation and
initialisation. This means that it knows neither the __file__ it is being
loaded from nor its package (i.e. its FQMN). This hinders relative imports
and resource loading. In Py3, it's also not being added to sys.modules,
which means that a (potentially transitive) re-import of the module will
really try to reimport it and thus run into an infinite loop when it
executes the module init function again. And without the FQMN, it's not
trivial to correctly add the module to sys.modules either.
We specifically run into this for Cython generated modules, for which it's
not uncommon that the module init code has the same level of complexity as
that of any 'regular' Python module. Also, the lack of a FQMN and correct
file path hinders the compilation of __init__.py modules, i.e. packages,
especially when relative imports are being used at module init time.
The proposal
============
I propose to split the extension module initialisation into two steps in
Python 3.4, in a backwards compatible way.
Step 1: The current module init function can be reduced to just creating
the module instance and returning it (and potentially doing some simple C
level setup). Optionally, after creating the module (and this is the new
part), the module init code can register a C callback function that will be
called after setting up the module.
Step 2: The shared library importer receives the module instance from the
module init function, adds __file__, __path__, __package__ and friends to
the module dict, and then checks for the callback. If non-NULL, it calls it
to continue the module initialisation by user code.
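To make the intended control flow concrete, here is a purely illustrative
Python model of the two steps (all names are hypothetical stand-ins; the
real mechanism is C-level and would use the registration API described
below)::

    import types

    def fake_PyInit_example():
        # Step 1: the init function only creates the module and registers
        # a callback for the remaining setup (modelled as an attribute
        # here; the proposal stores it in a new module struct field).
        m = types.ModuleType("example")
        def finish(module, context):
            # Runs after __file__/__package__ etc. have been set, so
            # relative imports and resource loading would work here.
            module.WHO = context["qualified_module_name"]
            return 0
        m._init_callback = finish
        return m

    def fake_extension_loader(qualname, path):
        # Step 2: the importer fills in the module metadata and then
        # calls the registered callback, if any.
        m = fake_PyInit_example()
        m.__file__ = path
        m.__package__ = qualname.rpartition(".")[0]
        cb = getattr(m, "_init_callback", None)
        if cb is not None:
            cb(m, {"module_name": qualname.rpartition(".")[2],
                   "qualified_module_name": qualname})
        return m

    mod = fake_extension_loader("pkg.example", "/path/to/example.so")
    print(mod.WHO)   # -> pkg.example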
The callback
============
The callback is defined as follows::

    int (*PyModule_init_callback)(PyObject* the_module,
                                  PyModuleInitContext* context)
"PyModuleInitContext" is a struct that is meant mostly for making the
callback more future proof by allowing additional parameters to be passed
in. For now, I can see a use case for the following fields::
    struct PyModuleInitContext {
        char* module_name;
        char* qualified_module_name;
    };
Both names are encoded in UTF-8. As for the file path, I consider it best
to retrieve it from the module's __file__ attribute as a Python string
object to reduce filename encoding problems.
Note that this struct argument is not strictly required, but given that
this proposal would have been much simpler if the module init function had
accepted such an argument in the first place, I consider it a good idea not
to let this chance pass by again.
The registration of the callback uses a new C-API function::

    int PyModule_SetInitFunction(PyObject* module,
                                 PyModule_init_callback callback)
The function name uses "Set" instead of "Register" to make it clear that
there is only one such function per module.
An alternative would be a new module creation function "PyModule_Create3()"
that takes the callback as third argument, in addition to what
"PyModule_Create2()" accepts. This would require users to explicitly pass
in the (second) version argument, which might be considered only a minor issue.
Implementation
==============
The implementation requires local changes to the extension module importer
and a new C-API function. In order to store the callback, it should use a
new field in the module object struct.
Open questions
==============
It is not clear how extensions should be handled that register more than
one module in their module init function, e.g. compiled packages. One
possibility would be to leave the setup to the user, who would have to know
all FQMNs anyway in this case, although not the import file path.
Alternatively, the import machinery could use a stack to remember for which
modules a callback was registered during the last init function call, set
up all of them and then call their callbacks. It's not clear if this meets
the intention of the user.
Alternatives
============
1) It would be possible to make extension modules optionally export another
symbol, e.g. "PyInit2_modulename", that the shared library loader would
call in addition to the required function "PyInit_modulename". This would
remove the need for a new API that registers the above callback. The
drawback is that it also makes it easier to write broken code because a
Python version or implementation that does not support this second symbol
would simply not call it, without error. The new C-API function would let
the build fail instead if it is not supported.
2) The callback could be made available as a Python function in the module
dict, thus also removing the need for an explicit registration API.
However, this approach would add overhead to both sides, the importer code
and the user provided module init code, as it would require additional
dictionary handling and the implementation of a one-time Python function in
user code. It would also suffer from the problem that missing support in
the runtime would pass silently.
3) The callback could be registered statically in the PyModuleDef struct by
adding a new field. This is not trivial to do in a backwards compatible way
because the struct would grow longer without explicit initialisation by
existing user code. Extending PyModuleDef_HEAD_INIT might be possible but
would still break at least binary compatibility.
4) Pass a new context argument into the module init function that contains
all information necessary to properly and completely set up the module at
creation time. This would provide a much simpler and cleaner solution than
the proposed solution. However, it will not be possible before Python 4 as
it breaks backwards compatibility with all existing extension modules at
both the source and binary level.
Hi,
Guido van Rossum and others asked me details on how file descriptors
and handles are inherited on Windows, for the PEP 446.
http://www.python.org/dev/peps/pep-0446/
I hacked Python 3.4 to add an os.get_cloexec() function (extracted from
my implementation of the PEP 433), here are some results.
Python functions open(), os.open() and os.dup() create file
descriptors with the HANDLE_FLAG_INHERIT flag set (cloexec=False),
whereas os.pipe() creates 2 file descriptors with the
HANDLE_FLAG_INHERIT flag unset (cloexec=True, see also issue #4708).
Even if the HANDLE_FLAG_INHERIT flag is set, all handles are closed if
subprocess is used with close_fds=True (which is the default value of
the parameter), and all file descriptors are closed except 0, 1 and 2.
If close_fds=False, handles with the HANDLE_FLAG_INHERIT flag set are
inherited, but all file descriptors are still closed except 0, 1 and
2.
(I didn't check if file descriptors 0, 1 and 2 are inherited,
duplicated or new file descriptors.)
PEP 446 allows you to control which handles are inherited by the child
process when you use subprocess with close_fds=False. (The subprocess
parameter should be called "close_handles" on Windows to avoid
confusion.)
Said differently: the HANDLE_FLAG_INHERIT flag only has an effect on
*handles*, as indicated in its name. On Windows, file *descriptors*
are never inherited (are always closed) in child processes. I don't
think that it is possible to inherit file descriptors on Windows.
By the way, using pass_fds on Windows raises an assertion error
("pass_fds not supported on Windows").
Another example in Python:
---
import subprocess, sys
code = """
import os, sys
fd = int(sys.argv[1])
f = os.fdopen(fd, "rb")
print(f.read())
"""
f = open(__file__, "rb")
fd = f.fileno()
subprocess.call([sys.executable, "-c", code, str(fd)], close_fds=False)
---
On Unix, the child process will write the script into stdout. On
Windows, you just get an OSError(9, "Bad file descriptor") exception.
To fix this example on Windows, you have to (see the sketch below):
* Retrieve the handle of the file using msvcrt.get_osfhandle();
* Pass the handle, instead of the file descriptor, to the child;
* Create a file descriptor from the handle using
msvcrt.open_osfhandle() in the child.
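A hedged, untested sketch of those three steps (Windows-only since it uses
msvcrt, and it assumes the file's handle remains inheritable as described
above):
---
import msvcrt, os, subprocess, sys

code = """
import msvcrt, os, sys
handle = int(sys.argv[1])
# Recreate a C file descriptor from the inherited handle in the child.
fd = msvcrt.open_osfhandle(handle, os.O_RDONLY)
f = os.fdopen(fd, "rb")
print(f.read())
"""

f = open(__file__, "rb")
# Pass the OS handle, not the file descriptor, to the child.
handle = msvcrt.get_osfhandle(f.fileno())
subprocess.call([sys.executable, "-c", code, str(handle)], close_fds=False)
---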
The fix would be simpler if Python provided the handle of a file
object (e.g. via a method) and if open() supported opening a handle as
it does with file descriptors on UNIX.
Victor