The current memory layout for dictionaries is
unnecessarily inefficient. It has a sparse table of
24-byte entries containing the hash value, key pointer,
and value pointer.
Instead, the 24-byte entries should be stored in a
dense table referenced by a sparse table of indices.
For example, the dictionary:
d = {'timmy': 'red', 'barry': 'green', 'guido': 'blue'}
is currently stored as:
entries = [['--', '--', '--'],
           [-8522787127447073495, 'barry', 'green'],
           ['--', '--', '--'],
           ['--', '--', '--'],
           ['--', '--', '--'],
           [-9092791511155847987, 'timmy', 'red'],
           ['--', '--', '--'],
           [-6480567542315338377, 'guido', 'blue']]
Instead, the data should be organized as follows:
indices = [None, 1, None, None, None, 0, None, 2]
entries = [[-9092791511155847987, 'timmy', 'red'],
           [-8522787127447073495, 'barry', 'green'],
           [-6480567542315338377, 'guido', 'blue']]
Only the data layout needs to change. The hash table
algorithms would stay the same. All of the current
optimizations would be kept, including key-sharing
dicts and custom lookup functions for string-only
dicts. There is no change to the hash functions, the
table search order, or collision statistics.
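To make the indirection concrete, here is a rough pure-Python
sketch of a lookup (illustrative only; real CPython dicts use open
addressing with perturbation, and the full proof-of-concept is
linked at the end of this post):

def lookup(indices, entries, key):
    # The sparse indices table is probed just like today's table; it
    # merely stores small integers pointing into the dense entries list.
    mask = len(indices) - 1
    h = hash(key)
    i = h & mask
    while True:
        ix = indices[i]
        if ix is None:                 # empty slot: key is absent
            raise KeyError(key)
        entry_hash, entry_key, value = entries[ix]
        if entry_key is key or (entry_hash == h and entry_key == key):
            return value
        i = (i + 1) & mask             # simplified linear probing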
The memory savings are significant (from 30% to 95%
compression, depending on how full the table is).
Small dicts (size 0, 1, or 2) get the most benefit.
For a sparse table of size t with n entries, the sizes are:
curr_size = 24 * t
new_size = 24 * n + sizeof(index) * t
In the above timmy/barry/guido example, the current
size is 192 bytes (eight 24-byte entries) and the new
size is 80 bytes (three 24-byte entries plus eight
1-byte indices). That gives 58% compression.
Note that sizeof(index) can be as small as a single
byte for small dicts, two bytes for bigger dicts, and
up to sizeof(Py_ssize_t) for huge dicts.
In addition to space savings, the new memory layout
makes iteration faster. Currently, keys(), values(), and
items() loop over the sparse table, skipping over free
slots in the hash table. Now, keys/values/items can
loop directly over the dense table, using fewer memory
accesses.
Another benefit is that resizing is faster and
touches fewer pieces of memory. Currently, every
hash/key/value entry is moved or copied during a
resize. In the new layout, only the indices are
updated. For the most part, the hash/key/value entries
never move (except for an occasional swap to fill a
hole left by a deletion).
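For illustration, a resize in the new layout amounts to rebuilding
just the index table from the dense entry list; a rough sketch
(same simplified probing as above):

def rebuild_indices(entries, new_size):
    # new_size is assumed to be a power of two.  The dense entry table
    # is left untouched; only the small index table is recomputed.
    indices = [None] * new_size
    mask = new_size - 1
    for n, (entry_hash, key, value) in enumerate(entries):
        i = entry_hash & mask
        while indices[i] is not None:  # find the next free index slot
            i = (i + 1) & mask         # simplified linear probing
        indices[i] = n
    return indices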
With the reduced memory footprint, we can also expect
better cache utilization.
For those wanting to experiment with the design,
there is a pure Python proof-of-concept here:
http://code.activestate.com/recipes/578375
YMMV: Keep in mind that the above size statistics assume a
build with 64-bit Py_ssize_t and 64-bit pointers. The
space savings percentages are a bit different on other
builds. Also, note that in many applications, the size
of the data dominates the size of the container (i.e.
the weight of a bucket of water is mostly the water,
not the bucket).
Raymond
I've received some enthusiastic emails from someone who wants to
revive restricted mode. He started out with a bunch of patches to the
CPython runtime using ctypes, which he attached to an App Engine bug:
http://code.google.com/p/googleappengine/issues/detail?id=671
Based on his code (the file secure.py is all you need, included in
secure.tar.gz) it seems he believes the only security leaks are
__subclasses__, gi_frame and gi_code. (I have since convinced him that
if we add "restricted" guards to these attributes, he doesn't need the
functions added to sys.)
I don't recall the exploits that Samuele once posted that caused the
death of rexec.py -- does anyone recall, or have a pointer to the
threads?
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
It appears there's no obvious link from bugs.python.org to the contributor
agreement - you need to go via the unintuitive link Foundation ->
Contribution Forms (and from what I've read, you're prompted when you add a
patch to the tracker).
I'd suggest that if the "Contributor Form Received" field is "No" in user
details, there be a link to http://www.python.org/psf/contrib/.
Tim Delaney
Hello.
I would like to discuss at the language summit a potential inclusion
of cffi[1] into the stdlib. This is a project Armin Rigo has been
working on for a while, with some input from other developers. It seems
that the main reason people would prefer ctypes over cffi these days is
"because it's included in the stdlib", which is not generally the reason I
would like to hear. Our calls to not use C extensions and to use an
FFI instead have seen very limited success with ctypes and quite a lot
more since cffi got released. The API is fairly stable right now, with
minor changes going in, and it will definitely stabilize before the Python 3.4
release. Notable projects using it:
* pypycore - gevent main loop ported to cffi
* pgsql2cffi
* sdl-cffi bindings
* tls-cffi bindings
* lxml-cffi port
* pyzmq
* cairo-cffi
* a bunch of others
So, relatively a lot, given that the project is not even a year old (it
got its 0.1 release in June). As per the documentation, the advantages over
ctypes:
* The goal is to call C code from Python. You should be able to do so
without learning a 3rd language: every alternative requires you to
learn their own language (Cython, SWIG) or API (ctypes). So we tried
to assume that you know Python and C and minimize the extra bits of
API that you need to learn.
* Keep all the Python-related logic in Python so that you don’t need
to write much C code (unlike CPython native C extensions).
* Work either at the level of the ABI (Application Binary Interface)
or the API (Application Programming Interface). Usually, C libraries
have a specified C API but often not an ABI (e.g. they may document a
“struct” as having at least these fields, but maybe more). (ctypes
works at the ABI level, whereas Cython and native C extensions work at
the API level.)
* We try to be complete. For now some C99 constructs are not
supported, but all C89 should be, including macros (and including
macro “abuses”, which you can manually wrap in saner-looking C
functions).
* We attempt to support both PyPy and CPython, with a reasonable path
for other Python implementations like IronPython and Jython.
* Note that this project is not about embedding executable C code in
Python, unlike Weave. This is about calling existing C libraries from
Python.
So, among other things, making a cffi extension gives you the same
level of security as writing C (unlike ctypes), and brings quite a
bit more flexibility (the API vs ABI issue) that lets you wrap arbitrary
libraries, even those full of macros.
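For a flavour of the ABI-level mode, here is a minimal sketch along
the lines of the cffi documentation (it loads the standard C library
via dlopen(None), so it assumes a Unix-like system):

from cffi import FFI

ffi = FFI()
# Declare the C function we want to call, copied from its man page.
ffi.cdef("int printf(const char *format, ...);")
# Load the standard C library at the ABI level (no compiler needed).
C = ffi.dlopen(None)
arg = ffi.new("char[]", b"world")   # char* owned by ffi
C.printf(b"hello, %s\n", arg)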
Cheers,
fijal
.. [1]: http://cffi.readthedocs.org/en/release-0.5/
hi, everyone:
I want to compile Python 3.3 with bz2 support on RedHat 5.5 but failed to do so. Here is what I did:
1. download bzip2 and compile it (make, make -f Makefile_libbz2_so, make install)
2. change to the Python 3.3 source directory: ./configure --with-bz2=/usr/local/include
3. make
4. make install
After the installation completed, I tested it:
[root@localhost Python-3.3.0]# python3 -c "import bz2"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/lib/python3.3/bz2.py", line 21, in <module>
from _bz2 import BZ2Compressor, BZ2Decompressor
ImportError: No module named '_bz2'
By the way, RedHat 5.5 has a built-in python 2.4.3. Would it be a problem?
Hi all,
I've been looking for a Github mirror for Python, and found two:
* https://github.com/python-git/python has a lot of forks/watches/stars
but seems to be very out of date (last updated 4 years ago)
* https://github.com/python-mirror/python doesn't appear to be very popular
but is updated daily
Are some of you the owners of these repositories? Should we consolidate to
a single "semi-official" mirror?
Eli
Hi,
This PEP proposes to add a __locallookup__ slot to type objects,
which is used by _PyType_Lookup and super_getattro instead of peeking
in the tp_dict of classes. The PEP text explains why this is needed.
Differences with the previous version:
* Better explanation of why this is a useful addition
* type.__locallookup__ is no longer optional.
* I've added benchmarking results using pybench.
(using the patch attached to issue 18181)
Ronald
PEP: 447
Title: Add __locallookup__ method to metaclass
Version: $Revision$
Last-Modified: $Date$
Author: Ronald Oussoren <ronaldoussoren(a)mac.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 12-Jun-2013
Post-History: 2-Jul-2013, 15-Jul-2013, 29-Jul-2013
Abstract
========
Currently ``object.__getattribute__`` and ``super.__getattribute__`` peek
in the ``__dict__`` of classes on the MRO for a class when looking for
an attribute. This PEP adds an optional ``__locallookup__`` method to
a metaclass that can be used to override this behavior.
Rationale
=========
It is currently not possible to influence how the `super class`_ looks
up attributes (that is, ``super.__getattribute__`` unconditionally
peeks in the class ``__dict__``), and that can be problematic for
dynamic classes that can grow new methods on demand.
The ``__locallookup__`` method makes it possible to dynamically add
attributes even when looking them up using the `super class`_.
The new method affects ``object.__getattribute__`` (and
`PyObject_GenericGetAttr`_) as well for consistency.
Background
----------
The current behavior of ``super.__getattribute__`` causes problems for
classes that are dynamic proxies for other (non-Python) classes or types,
an example of which is `PyObjC`_. PyObjC creates a Python class for every
class in the Objective-C runtime, and looks up methods in the Objective-C
runtime when they are used. This works fine for normal access, but doesn't
work for access with ``super`` objects. Because of this PyObjC currently
includes a custom ``super`` that must be used with its classes.
The API in this PEP makes it possible to remove the custom ``super`` and
simplifies the implementation because the custom lookup behavior can be
added in a central location.
The superclass attribute lookup hook
====================================
Both ``super.__getattribute__`` and ``object.__getattribute__`` (or
`PyObject_GenericGetAttr`_ in C code) walk an object's MRO and peek in the
class' ``__dict__`` to look up attributes. A way to affect this lookup is
using a method on the meta class for the type, that by default looks up
the name in the class ``__dict__``.
In Python code
--------------
A meta type can define a method ``__locallookup__`` that is called during
attribute resolution by both ``super.__getattribute__`` and
``object.__getattribute__``::

    class MetaType(type):
        def __locallookup__(cls, name):
            try:
                return cls.__dict__[name]
            except KeyError:
                raise AttributeError(name) from None
The ``__locallookup__`` method has as its arguments a class and the name of the attribute
that is looked up. It should return the value of the attribute without invoking descriptors,
or raise `AttributeError`_ when the name cannot be found.
The `type`_ class provides a default implementation for ``__locallookup__``, that
looks up the name in the class dictionary.
Example usage
.............
The code below implements a silly metaclass that redirects attribute lookup to uppercase
versions of names::

    class UpperCaseAccess (type):
        def __locallookup__(cls, name):
            return cls.__dict__[name.upper()]

    class SillyObject (metaclass=UpperCaseAccess):
        def m(self):
            return 42

        def M(self):
            return "fortytwo"

    obj = SillyObject()
    assert obj.m() == "fortytwo"
In C code
---------
A new slot ``tp_locallookup`` is added to the ``PyTypeObject`` struct; this slot
corresponds to the ``__locallookup__`` method on `type`_.
The slot has the following prototype::

    PyObject* (*locallookupfunc)(PyTypeObject* cls, PyObject* name);
This method should look up *name* in the namespace of *cls*, without looking at superclasses,
and should not invoke descriptors. The method returns ``NULL`` without setting an exception
when the *name* cannot be found, and returns a new reference otherwise (not a borrowed reference).
Use of this hook by the interpreter
-----------------------------------
The new method is required for metatypes and as such is defined on `type`_. Both
``super.__getattribute__`` and ``object.__getattribute__``/`PyObject_GenericGetAttr`_
(through ``_PyType_Lookup``) use this ``__locallookup__`` method when walking
the MRO.
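As a rough illustration, a pure-Python approximation of the changed lookup
(ignoring the method cache, descriptors and error handling; it assumes the
metaclass provides ``__locallookup__`` as described above)::

    def _pytype_lookup(cls, name):
        # Walk the MRO and ask each class (via its metaclass) for the
        # attribute instead of peeking into the class __dict__ directly.
        for klass in cls.__mro__:
            try:
                return type(klass).__locallookup__(klass, name)
            except AttributeError:
                continue
        return None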
Other changes to the implementation
-----------------------------------
The change for `PyObject_GenericGetAttr`_ will be done by changing the private function
``_PyType_Lookup``. This currently returns a borrowed reference, but must return a new
reference when the ``__locallookup__`` method is present. Because of this ``_PyType_Lookup``
will be renamed to ``_PyType_LookupName``, this will cause compile-time errors for all out-of-tree
users of this private API.
The attribute lookup cache in ``Objects/typeobject.c`` is disabled for classes that have a
metaclass that overrides ``__locallookup__``, because using the cache might not be valid
for such classes.
Performance impact
------------------
The pybench output below compares an implementation of this PEP with the regular
source tree, both based on changeset a5681f50bae2, run on an idle machine with a
Core i7 processor running CentOS 6.4.
Even though the machine was idle there were clear differences between runs;
I've seen differences in "minimum time" vary from -0.1% to +1.5%, with similar
(but slightly smaller) differences in the "average time" figures.
.. ::
-------------------------------------------------------------------------------
PYBENCH 2.1
-------------------------------------------------------------------------------
* using CPython 3.4.0a0 (default, Jul 29 2013, 13:01:34) [GCC 4.4.7 20120313 (Red Hat 4.4.7-3)]
* disabled garbage collection
* system check interval set to maximum: 2147483647
* using timer: time.perf_counter
* timer: resolution=1e-09, implementation=clock_gettime(CLOCK_MONOTONIC)
-------------------------------------------------------------------------------
Benchmark: pep447.pybench
-------------------------------------------------------------------------------
Rounds: 10
Warp: 10
Timer: time.perf_counter
Machine Details:
Platform ID: Linux-2.6.32-358.114.1.openstack.el6.x86_64-x86_64-with-centos-6.4-Final
Processor: x86_64
Python:
Implementation: CPython
Executable: /tmp/default-pep447/bin/python3
Version: 3.4.0a0
Compiler: GCC 4.4.7 20120313 (Red Hat 4.4.7-3)
Bits: 64bit
Build: Jul 29 2013 14:09:12 (#default)
Unicode: UCS4
-------------------------------------------------------------------------------
Comparing with: default.pybench
-------------------------------------------------------------------------------
Rounds: 10
Warp: 10
Timer: time.perf_counter
Machine Details:
Platform ID: Linux-2.6.32-358.114.1.openstack.el6.x86_64-x86_64-with-centos-6.4-Final
Processor: x86_64
Python:
Implementation: CPython
Executable: /tmp/default/bin/python3
Version: 3.4.0a0
Compiler: GCC 4.4.7 20120313 (Red Hat 4.4.7-3)
Bits: 64bit
Build: Jul 29 2013 13:01:34 (#default)
Unicode: UCS4
Test minimum run-time average run-time
this other diff this other diff
-------------------------------------------------------------------------------
BuiltinFunctionCalls: 45ms 44ms +1.3% 45ms 44ms +1.3%
BuiltinMethodLookup: 26ms 27ms -2.4% 27ms 27ms -2.2%
CompareFloats: 33ms 34ms -0.7% 33ms 34ms -1.1%
CompareFloatsIntegers: 66ms 67ms -0.9% 66ms 67ms -0.8%
CompareIntegers: 51ms 50ms +0.9% 51ms 50ms +0.8%
CompareInternedStrings: 34ms 33ms +0.4% 34ms 34ms -0.4%
CompareLongs: 29ms 29ms -0.1% 29ms 29ms -0.0%
CompareStrings: 43ms 44ms -1.8% 44ms 44ms -1.8%
ComplexPythonFunctionCalls: 44ms 42ms +3.9% 44ms 42ms +4.1%
ConcatStrings: 33ms 33ms -0.4% 33ms 33ms -1.0%
CreateInstances: 47ms 48ms -2.9% 47ms 49ms -3.4%
CreateNewInstances: 35ms 36ms -2.5% 36ms 36ms -2.5%
CreateStringsWithConcat: 69ms 70ms -0.7% 69ms 70ms -0.9%
DictCreation: 52ms 50ms +3.1% 52ms 50ms +3.0%
DictWithFloatKeys: 40ms 44ms -10.1% 43ms 45ms -5.8%
DictWithIntegerKeys: 32ms 36ms -11.2% 35ms 37ms -4.6%
DictWithStringKeys: 29ms 34ms -15.7% 35ms 40ms -11.0%
ForLoops: 30ms 29ms +2.2% 30ms 29ms +2.2%
IfThenElse: 38ms 41ms -6.7% 38ms 41ms -6.9%
ListSlicing: 36ms 36ms -0.7% 36ms 37ms -1.3%
NestedForLoops: 43ms 45ms -3.1% 43ms 45ms -3.2%
NestedListComprehensions: 39ms 40ms -1.7% 39ms 40ms -2.1%
NormalClassAttribute: 86ms 82ms +5.1% 86ms 82ms +5.0%
NormalInstanceAttribute: 42ms 42ms +0.3% 42ms 42ms +0.0%
PythonFunctionCalls: 39ms 38ms +3.5% 39ms 38ms +2.8%
PythonMethodCalls: 51ms 49ms +3.0% 51ms 50ms +2.8%
Recursion: 67ms 68ms -1.4% 67ms 68ms -1.4%
SecondImport: 41ms 36ms +12.5% 41ms 36ms +12.6%
SecondPackageImport: 45ms 40ms +13.1% 45ms 40ms +13.2%
SecondSubmoduleImport: 92ms 95ms -2.4% 95ms 98ms -3.6%
SimpleComplexArithmetic: 28ms 28ms -0.1% 28ms 28ms -0.2%
SimpleDictManipulation: 57ms 57ms -1.0% 57ms 58ms -1.0%
SimpleFloatArithmetic: 29ms 28ms +4.7% 29ms 28ms +4.9%
SimpleIntFloatArithmetic: 37ms 41ms -8.5% 37ms 41ms -8.7%
SimpleIntegerArithmetic: 37ms 41ms -9.4% 37ms 42ms -10.2%
SimpleListComprehensions: 33ms 33ms -1.9% 33ms 34ms -2.9%
SimpleListManipulation: 28ms 30ms -4.3% 29ms 30ms -4.1%
SimpleLongArithmetic: 26ms 26ms +0.5% 26ms 26ms +0.5%
SmallLists: 40ms 40ms +0.1% 40ms 40ms +0.1%
SmallTuples: 46ms 47ms -2.4% 46ms 48ms -3.0%
SpecialClassAttribute: 126ms 120ms +4.7% 126ms 121ms +4.4%
SpecialInstanceAttribute: 42ms 42ms +0.6% 42ms 42ms +0.8%
StringMappings: 94ms 91ms +3.9% 94ms 91ms +3.8%
StringPredicates: 48ms 49ms -1.7% 48ms 49ms -2.1%
StringSlicing: 45ms 45ms +1.4% 46ms 45ms +1.5%
TryExcept: 23ms 22ms +4.9% 23ms 22ms +4.8%
TryFinally: 32ms 32ms -0.1% 32ms 32ms +0.1%
TryRaiseExcept: 17ms 17ms +0.9% 17ms 17ms +0.5%
TupleSlicing: 49ms 48ms +1.1% 49ms 49ms +1.0%
WithFinally: 48ms 47ms +2.3% 48ms 47ms +2.4%
WithRaiseExcept: 45ms 44ms +0.8% 45ms 45ms +0.5%
-------------------------------------------------------------------------------
Totals: 2284ms 2287ms -0.1% 2306ms 2308ms -0.1%
(this=pep447.pybench, other=default.pybench)
Alternative proposals
---------------------
``__getattribute_super__``
..........................
An earlier version of this PEP used the following static method on classes::

    def __getattribute_super__(cls, name, object, owner): pass
This method performed name lookup as well as invoking descriptors and was necessarily
limited to working only with ``super.__getattribute__``.
Reuse ``tp_getattro``
.....................
It would be nice to avoid adding a new slot, thus keeping the API simpler and
easier to understand. A comment on `Issue 18181`_ asked about reusing the
``tp_getattro`` slot; that is, super could call the ``tp_getattro`` slot of all
classes along the MRO.
That won't work because ``tp_getattro`` will look in the instance
``__dict__`` before it tries to resolve attributes using classes in the MRO.
This would mean that using ``tp_getattro`` instead of peeking the class
dictionaries changes the semantics of the `super class`_.
References
==========
* `Issue 18181`_ contains a prototype implementation
Copyright
=========
This document has been placed in the public domain.
.. _`Issue 18181`: http://bugs.python.org/issue18181
.. _`super class`: http://docs.python.org/3/library/functions.html#super
.. _`NotImplemented`: http://docs.python.org/3/library/constants.html#NotImplemented
.. _`PyObject_GenericGetAttr`: http://docs.python.org/3/c-api/object.html#PyObject_GenericGetAttr
.. _`type`: http://docs.python.org/3/library/functions.html#type
.. _`AttributeError`: http://docs.python.org/3/library/exceptions.html#AttributeError
.. _`PyObjC`: http://pyobjc.sourceforge.net/
.. _`classmethod`: http://docs.python.org/3/library/functions.html#classmethod
Dear Python developers community,
Recently I found out that it is not possible to debug Python code if it
is part of a zip module.
The Python version being used is 3.3.0.
Well known GUI debuggers like Eclipse+PyDev or PyCharm are unable to
start debugging and give the following warning:
---
pydev debugger: CRITICAL WARNING: This version of python seems to be
incorrectly compiled (internal generated filenames are not absolute)
pydev debugger: The debugger may still function, but it will work
slower and may miss breakpoints.
---
So I started my own investigation of this issue, and the results are the following.
At first I took the traditional Python debugger 'pdb' to analyze how it
behaves while debugging a zipped module.
'pdb' showed me some backtraces, and the filename part of the stack entries
looks malformed.
I expected something like
'full-path-to-zip-dir/my_zipped_module.zip/subdir/test_module.py'
but really it looks like 'full-path-to-current-dir/subdir/test_module.py'.
The source code of pdb.py and bdb.py (which are part of the Python
stdlib) gave me the answer as to why it happens.
The root cause is inside Bdb.format_stack_entry() + Bdb.canonic().
Please take a look at the following line inside 'format_stack_entry' method:
filename = self.canonic(frame.f_code.co_filename)
For a zipped module the variable 'frame.f_code.co_filename' holds a _relative_
file path starting from the root of the zip archive, like
'subdir/test_module.py'.
And as a result the Bdb.canonic() method gives what we have -
'full-path-to-current-dir/subdir/test_module.py'
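A minimal sketch of what Bdb.canonic() effectively does with such a relative
path (simplified; the real method also caches results and special-cases
'<...>' filenames):

import os

def canonic(filename):
    # A relative co_filename is resolved against the current working
    # directory, so the zip archive part of the path is lost.
    return os.path.normcase(os.path.abspath(filename))

print(canonic("subdir/test_module.py"))
# e.g. /full-path-to-current-dir/subdir/test_module.py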
---
So my final question.
Could anyone confirm that it is a bug in the Python core subsystem
responsible for loading zipped modules,
or that something is wrong with my zipped module?
If it is a bug, could you please submit it to the official Python bug tracker?
I have never done it before and am afraid of doing something wrong.
Thanks,
Vitaly Murashev
Hi,
I suspect that this will be put into a proper PEP at some point, but I'd
like to bring this up for discussion first. This came out of issues 13429
and 16392.
http://bugs.python.org/issue13429
http://bugs.python.org/issue16392
Stefan
The problem
===========
Python modules and extension modules are not being set up in the same way.
For Python modules, the module is created and set up first, then the module
code is being executed. For extensions, i.e. shared libraries, the module
init function is executed straight away and does both the creation and
initialisation. This means that it knows neither the __file__ it is being
loaded from nor its package (i.e. its FQMN). This hinders relative imports
and resource loading. In Py3, it's also not being added to sys.modules,
which means that a (potentially transitive) re-import of the module will
really try to reimport it and thus run into an infinite loop when it
executes the module init function again. And without the FQMN, it's not
trivial to correctly add the module to sys.modules either.
We specifically run into this for Cython generated modules, for which it's
not uncommon that the module init code has the same level of complexity as
that of any 'regular' Python module. Also, the lack of a FQMN and correct
file path hinders the compilation of __init__.py modules, i.e. packages,
especially when relative imports are being used at module init time.
The proposal
============
I propose to split the extension module initialisation into two steps in
Python 3.4, in a backwards compatible way.
Step 1: The current module init function can be reduced to just creating
the module instance and returning it (and potentially doing some simple C
level setup). Optionally, after creating the module (and this is the new
part), the module init code can register a C callback function that will be
called after setting up the module.
Step 2: The shared library importer receives the module instance from the
module init function, adds __file__, __path__, __package__ and friends to
the module dict, and then checks for the callback. If non-NULL, it calls it
to continue the module initialisation by user code.
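To make the intended control flow concrete, here is a purely illustrative
Python model of the two steps (all names are hypothetical stand-ins; the
real mechanism is C-level and would use the registration API described
below)::

    import types

    def fake_PyInit_example():
        # Step 1: the init function only creates the module and registers
        # a callback for the remaining setup (modelled as an attribute
        # here; the proposal stores it in a new module struct field).
        m = types.ModuleType("example")
        def finish(module, context):
            # Runs after __file__/__package__ etc. have been set, so
            # relative imports and resource loading would work here.
            module.WHO = context["qualified_module_name"]
            return 0
        m._init_callback = finish
        return m

    def fake_extension_loader(qualname, path):
        # Step 2: the importer fills in the module metadata and then
        # calls the registered callback, if any.
        m = fake_PyInit_example()
        m.__file__ = path
        m.__package__ = qualname.rpartition(".")[0]
        cb = getattr(m, "_init_callback", None)
        if cb is not None:
            cb(m, {"module_name": qualname.rpartition(".")[2],
                   "qualified_module_name": qualname})
        return m

    mod = fake_extension_loader("pkg.example", "/path/to/example.so")
    print(mod.WHO)   # -> pkg.example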
The callback
============
The callback is defined as follows::

    int (*PyModule_init_callback)(PyObject* the_module,
                                  PyModuleInitContext* context)
"PyModuleInitContext" is a struct that is meant mostly for making the
callback more future proof by allowing additional parameters to be passed
in. For now, I can see a use case for the following fields::
    struct PyModuleInitContext {
        char* module_name;
        char* qualified_module_name;
    };
Both names are encoded in UTF-8. As for the file path, I consider it best
to retrieve it from the module's __file__ attribute as a Python string
object to reduce filename encoding problems.
Note that this struct argument is not strictly required, but given that
this proposal would have been much simpler if the module init function had
accepted such an argument in the first place, I consider it a good idea not
to let this chance pass by again.
The registration of the callback uses a new C-API function::

    int PyModule_SetInitFunction(PyObject* module,
                                 PyModule_init_callback callback)
The function name uses "Set" instead of "Register" to make it clear that
there is only one such function per module.
An alternative would be a new module creation function "PyModule_Create3()"
that takes the callback as third argument, in addition to what
"PyModule_Create2()" accepts. This would require users to explicitly pass
in the (second) version argument, which might be considered only a minor issue.
Implementation
==============
The implementation requires local changes to the extension module importer
and a new C-API function. In order to store the callback, it should use a
new field in the module object struct.
Open questions
==============
It is not clear how extensions should be handled that register more than
one module in their module init function, e.g. compiled packages. One
possibility would be to leave the setup to the user, who would have to know
all FQMNs anyway in this case, although not the import file path.
Alternatively, the import machinery could use a stack to remember for which
modules a callback was registered during the last init function call, set
up all of them and then call their callbacks. It's not clear if this meets
the intention of the user.
Alternatives
============
1) It would be possible to make extension modules optionally export another
symbol, e.g. "PyInit2_modulename", that the shared library loader would
call in addition to the required function "PyInit_modulename". This would
remove the need for a new API that registers the above callback. The
drawback is that it also makes it easier to write broken code because a
Python version or implementation that does not support this second symbol
would simply not call it, without error. The new C-API function would let
the build fail instead if it is not supported.
2) The callback could be made available as a Python function in the module
dict, thus also removing the need for an explicit registration API.
However, this approach would add overhead to both sides, the importer code
and the user provided module init code, as it would require additional
dictionary handling and the implementation of a one-time Python function in
user code. It would also suffer from the problem that missing support in
the runtime would pass silently.
3) The callback could be registered statically in the PyModuleDef struct by
adding a new field. This is not trivial to do in a backwards compatible way
because the struct would grow longer without explicit initialisation by
existing user code. Extending PyModuleDef_HEAD_INIT might be possible but
would still break at least binary compatibility.
4) Pass a new context argument into the module init function that contains
all information necessary to properly and completely set up the module at
creation time. This would provide a much simpler and cleaner solution than
the proposed solution. However, it will not be possible before Python 4 as
it breaks backwards compatibility with all existing extension modules at
both the source and binary level.
Hi,
Guido van Rossum and others asked me details on how file descriptors
and handles are inherited on Windows, for the PEP 446.
http://www.python.org/dev/peps/pep-0446/
I hacked Python 3.4 to add an os.get_cloexec() function (extracted from
my implementation of the PEP 433), here are some results.
Python functions open(), os.open() and os.dup() create file
descriptors with the HANDLE_FLAG_INHERIT flag set (cloexec=False),
whereas os.pipe() creates 2 file descriptors with the
HANDLE_FLAG_INHERIT flag unset (cloexec=True, see also issue #4708).
Even if the HANDLE_FLAG_INHERIT flag is set, all handles are closed if
subprocess is used with close_fds=True (which is the default value of
the parameter), and all file descriptors are closed except 0, 1 and 2.
If close_fds=False, handles with the HANDLE_FLAG_INHERIT flag set are
inherited, but all file descriptors are still closed except 0, 1 and
2.
(I didn't check if file descriptors 0, 1 and 2 are inherited,
duplicated or new file descriptors.)
PEP 446 allows you to control which handles are inherited by the child
process when you use subprocess with close_fds=False. (The subprocess
parameter should be called "close_handles" on Windows to avoid
confusion.)
Said differently: the HANDLE_FLAG_INHERIT flag only has an effect on
*handles*, as indicated in its name. On Windows, file *descriptors*
are never inherited (are always closed) in child processes. I don't
think that it is possible to inherit file descriptors on Windows.
By the way, using pass_fds on Windows raises an assertion error
("pass_fds not supported on Windows").
Another example in Python:
---
import subprocess, sys
code = """
import os, sys
fd = int(sys.argv[1])
f = os.fdopen(fd, "rb")
print(f.read())
"""
f = open(__file__, "rb")
fd = f.fileno()
subprocess.call([sys.executable, "-c", code, str(fd)], close_fds=False)
---
On Unix, the child process will write the script into stdout. On
Windows, you just get an OSError(9, "Bad file descriptor") exception.
To fix this example on Windows, you have to (see the sketch below):
* Retrieve the handle of the file using msvcrt.get_osfhandle();
* Pass the handle, instead of the file descriptor, to the child;
* Create a file descriptor from the handle using
msvcrt.open_osfhandle() in the child.
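A hedged, untested sketch of those three steps (Windows-only since it uses
msvcrt, and it assumes the file's handle remains inheritable as described
above):
---
import msvcrt, os, subprocess, sys

code = """
import msvcrt, os, sys
handle = int(sys.argv[1])
# Recreate a C file descriptor from the inherited handle in the child.
fd = msvcrt.open_osfhandle(handle, os.O_RDONLY)
f = os.fdopen(fd, "rb")
print(f.read())
"""

f = open(__file__, "rb")
# Pass the OS handle, not the file descriptor, to the child.
handle = msvcrt.get_osfhandle(f.fileno())
subprocess.call([sys.executable, "-c", code, str(handle)], close_fds=False)
---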
The fix would be simpler if Python provided the handle of a file
object (e.g. via a method) and if open() supported opening a handle as
it does with file descriptors on UNIX.
Victor