[Cython] CEP1000: Native dispatch through callables
Dag Sverre Seljebotn
d.s.seljebotn at astro.uio.no
Sat Apr 14 00:31:58 CEST 2012
Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no> wrote:
>Robert Bradshaw <robertwb at gmail.com> wrote:
>>On Fri, Apr 13, 2012 at 2:24 PM, Nathaniel Smith <njs at pobox.com> wrote:
>>> On Fri, Apr 13, 2012 at 9:27 PM, Dag Sverre Seljebotn
>>> <d.s.seljebotn at astro.uio.no> wrote:
>>>> Ah, I didn't think about 6-bit or huffman. Certainly helps.
>>>> I'm almost +1 on your proposal now, but a couple of more ideas:
>>>> 1) Let the key (the size_t) spill over to the next specialization slot
>>>> if it is too large, and prepend that key with a continuation code (two
>>>> keys could together say "iii)-d\0\0" on 32-bit systems with 8-bit
>>>> codes, the first char acting as continuation). The key-based caller
>>>> will expect a continuation if it knows about the specialization, and
>>>> the prepended char will prevent spurious matches against the
>>>> overspilled slot.
>>>> We could even use the pointers for part of the continuation...
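The spill-over scheme sketched above can be made concrete. The following is a minimal illustration only, not the CEP's actual encoding: the 6-bit alphabet, the `CONTINUATION` value, and all names are invented here, and a real implementation would need a richer code table.

```c
/* Toy sketch of the key-based scheme under discussion: pack a
 * signature string into a size_t using a 6-bit code per character,
 * spilling into a second key (marked by a continuation code) when
 * the signature does not fit in one word.  Alphabet and code
 * values are invented for illustration. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BITS_PER_CODE 6
#define CODES_PER_KEY ((sizeof(size_t) * 8) / BITS_PER_CODE)
#define CONTINUATION  63  /* reserved code marking an overspill key */

/* Toy 6-bit alphabet: 'a'..'z' -> 1..26, plus a little punctuation. */
static int code_of(char c) {
    if (c >= 'a' && c <= 'z') return 1 + (c - 'a');
    switch (c) {
        case ')': return 27;
        case '-': return 28;
        default:  return 0;  /* unknown: treat as terminator */
    }
}

/* Encode sig into at most two keys; returns how many were used. */
static int encode_signature(const char *sig, size_t keys[2]) {
    size_t n = strlen(sig), i;
    keys[0] = keys[1] = 0;
    for (i = 0; i < n && i < CODES_PER_KEY; i++)
        keys[0] |= (size_t)code_of(sig[i]) << (i * BITS_PER_CODE);
    if (n <= CODES_PER_KEY) return 1;
    /* The second key starts with the continuation code, so an
     * overspilled slot can never collide with an ordinary key. */
    keys[1] = CONTINUATION;
    for (i = CODES_PER_KEY; i < n && i < 2 * CODES_PER_KEY - 1; i++)
        keys[1] |= (size_t)code_of(sig[i])
                   << ((i - CODES_PER_KEY + 1) * BITS_PER_CODE);
    return 2;
}
```

On a 64-bit system a key holds ten 6-bit codes, so a short signature like "iii)-d" fits in a single word and only long signatures pay for the second slot.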
>>> I am really lost here. Why is any of this complicated encoding stuff
>>> better than interning? Interning takes one line of code, is
>>> cheap (one dict lookup per call site and function definition), and
>>> lets you check any possible signature (even complicated ones involving
>>> memoryviews) by doing a single-word comparison. And best of all, you
>>> don't have to think hard to make sure you got the encoding right.
>>> On a 32-bit system, pointers are smaller than a size_t, but more
>>> expressive! You can still do binary search if you want, etc. Is the
>>> problem just that interning requires a runtime calculation? Because I
>>> feel like C users (like numpy) will want to compute these compressed
>>> codes at module-init anyway, and those of us with a fancy compiler
>>> capable of computing them ahead of time (like Cython) can instruct
>>> that fancy compiler to compute them at module-init time just as
>>> easily.
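The interning alternative described here can be sketched in a few lines of C. This toy version uses a fixed-size linear list and no locking, so it only illustrates the call-site economics (a strcmp-based lookup once at registration, a single pointer comparison per call); the names are invented.

```c
/* Minimal sketch of signature interning: every module registers its
 * signature strings in one shared table at init time, after which
 * matching a call site against a specialization is one word compare.
 * Not thread safe; a real version would need a lock and growth. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_INTERNED 256

static const char *intern_table[MAX_INTERNED];
static int intern_count = 0;

/* Return a canonical pointer for sig: equal strings always yield
 * the same pointer, so callers compare pointers, not bytes. */
static const char *intern_signature(const char *sig) {
    int i;
    char *copy;
    for (i = 0; i < intern_count; i++)
        if (strcmp(intern_table[i], sig) == 0)
            return intern_table[i];
    if (intern_count == MAX_INTERNED)
        return NULL;  /* table full; a real version would grow it */
    copy = malloc(strlen(sig) + 1);
    strcpy(copy, sig);
    intern_table[intern_count] = copy;
    return intern_table[intern_count++];
}
```

At module-init time each module interns its specializations once; at call time, `callsite_sig == spec_sig` is the entire match, regardless of how long or complicated the signature string is.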
>>The primary disadvantage of interning that I see is memory locality. I
>>suppose if all the C-level caches of interned values were co-located,
>>this may not be as big of an issue. Not being able to compare against
>>compile-time constants may thwart some optimization opportunities, but
>>that's less clear.
>>It also requires coordinating on a common repository, but I suppose one
>>would just stick a set in some standard module (or leverage Python's
>>own string interning).
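The compile-time-constant point above can be shown side by side. Both functions below are invented examples, and the key value is a placeholder rather than any real CEP 1000 encoding.

```c
/* Sketch of the optimization trade-off: a compressed key lets the
 * caller compare against a literal the compiler folds into the
 * instruction stream, while an interned pointer only exists at
 * runtime and must be loaded from a writable global. */
#include <assert.h>
#include <stddef.h>

#define KEY_DD_D ((size_t)0x1c39c)  /* hypothetical key for "dd)d" */

/* Key-based match: the constant is an immediate operand. */
static int match_by_key(size_t callsite_key) {
    return callsite_key == KEY_DD_D;
}

/* Interning-based match: the canonical pointer is filled in at
 * module init, so it lives in a read-write data segment. */
static const char *interned_dd_d;

static int match_by_pointer(const char *callsite_sig) {
    return callsite_sig == interned_dd_d;
}
```

Both are a single word comparison per call; the difference Robert raises is where the comparand lives (read-only immediate versus read-write global), which is what the locality argument is about.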
>1) It doesn't work well with multiple interpreter states. Ok, nothing
>works with that at the moment, but it is on the roadmap for Python and
>we should not make it worse.
>You basically *need* a thread-safe store separate from any Python
>interpreter; though pythread.h does not rely on the interpreter state;
No, it doesn't, unless we want to ship a single(!) .so-file that can be depended upon by all relevant projects. There's just no way for loaded modules to communicate and synchronize that they know about this CEP except through an interpreter...
That's almost impossible to work around in any clean way (I can think of several very ugly ones...). Unless the multiple-interpreter-state idea is entirely dead in CPython, interning must be done separately for each interpreter and the values stored in the module object. Ugh.
>2) You end up with the known comparison values in read-write memory
>segments rather than read-only segments, which is probably worse.
>I really think that anything we can do to make this near C speed
>should be done; none of the proposals are *that* complicated.
>Using keys, NumPy can choose in its C code to be slower but more
>readable; but using interned strings forces Cython to be slower, and
>Cython gets no way of choosing to go faster. (To the degree that any of
>this has an effect; none of these claims have been checked.)
Sent from my Android phone with K-9 Mail. Please excuse my brevity.
More information about the cython-devel mailing list