[Python-Dev] C-level duck typing
Stefan Behnel
stefan_ml at behnel.de
Wed May 16 11:22:31 CEST 2012
"Martin v. Löwis", 16.05.2012 10:36:
>> And, we want this to somehow work with existing Python; we still
>> support users on Python 2.4.
>
> This makes the question out-of-scope for python-dev - we only discuss
> new versions of Python here. Old versions cannot be developed anymore
> (as they are released already).
Well, it's in scope because CPython would have to support this in a future
version, or at least have to make sure it knows about it so that it can
stay out of the way (depending on how the solution eventually ends up
working). We're also very much interested in input from the CPython core
developers regarding the design, because we think that this should become a
general feature of the Python platform (potentially also for PyPy etc.).
The fact that we need to support it in older CPython versions is also
relevant, because the solution we choose shouldn't conflict with them. That
those versions are no longer being developed actually helps, because it
means they won't interfere in the future any more than they already do.
>> typedef struct {
>> unsigned long extension_id;
>> void *data;
>> } PyTypeObjectExtensionEntry;
>>
>> and then a type object can (somehow!) point to an array of these. The
>> array is linearly scanned
>
> It's unclear to me why you think that a linear scan is faster than
> a dictionary lookup. The contrary will be the case - the dictionary
> lookup (PyObject_GetAttr) will be much faster.
Agreed in general, but in this case it's really not that easy. A C
function call involves a certain overhead all by itself, so calling into
the C-API multiple times may be substantially more costly than, say,
calling through a function pointer once and then running over a returned C
array comparing numbers. And it's definitely way more costly than running
over an array that the type struct points to directly. We are not talking
about hundreds of entries here, just a few. A linear scan in 64-bit steps
over something like a hundred bytes in the L1 cache should hardly be
measurable.
This might sound like a premature micro-optimisation, but these things can
quickly add up, e.g. when running a user-provided function over a large
array. (And, yes, we'd try to do caching and all that...)
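For illustration, here is a minimal sketch of the kind of lookup we have in
mind. The helper name, the zero-terminated entry array and the way a type
would expose it are assumptions for the sake of the example, not an
existing API:

/* Minimal sketch only: the sentinel convention (extension_id == 0 ends
 * the array), the helper name and the way a type exposes its entry
 * array are assumptions, not an existing CPython API. */
#include <stddef.h>

typedef struct {
    unsigned long extension_id;
    void *data;
} PyTypeObjectExtensionEntry;

/* Scan the (hypothetical) entry array that a participating type points
 * to.  With only a handful of 16-byte entries on a 64-bit platform,
 * this touches one or two cache lines and avoids any C-API round trip. */
static void *
find_type_extension(const PyTypeObjectExtensionEntry *entries,
                    unsigned long extension_id)
{
    for (; entries->extension_id != 0; entries++) {
        if (entries->extension_id == extension_id)
            return entries->data;
    }
    return NULL;  /* the type does not provide this extension */
}

The caller would fetch the array once (e.g. through a single function
pointer) and then do the comparisons entirely in C, which is exactly the
cheap inner loop described above.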
Stefan