[Cython] Hash-based vtables

Dag Sverre Seljebotn d.s.seljebotn at astro.uio.no
Tue Jun 12 22:00:35 CEST 2012


On 06/12/2012 09:46 PM, Dag Sverre Seljebotn wrote:
> On 06/12/2012 08:12 PM, Robert Bradshaw wrote:
>> On Tue, Jun 12, 2012 at 10:21 AM, Dag Sverre Seljebotn
>> <d.s.seljebotn at astro.uio.no> wrote:
>>> On 06/12/2012 01:01 PM, Dag Sverre Seljebotn wrote:
>>>>
>>>> On 06/10/2012 11:53 AM, Robert Bradshaw wrote:
>>>>>
>>>>> On Sun, Jun 10, 2012 at 1:43 AM, Dag Sverre Seljebotn
>>>>>>
>>>>>> About signatures, a problem I see with following the C typing is that
>>>>>> the
>>>>>> signature "ill" wouldn't hash the same as "iii" on 32-bit Windows and
>>>>>> "iqq"
>>>>>> on 32-bit Linux, and so on. I think that would be really bad.
>>>>>
>>>>>
>>>>> This is why I suggested promotion for scalars (divide ints into
>>>>> <=sizeof(long) and sizeof(long)< x<= sizeof(long long)), checked at
>>>>> C compile time, though I guess you consider that evil. I don't
>>>>> consider not matching really bad, just kind of bad.
>>>>
>>>>
>>>> Right. At least a convention for promotion of scalars would be good
>>>> anyway.
>>>>
>>>> Even MSVC supports stdint.h these days; basing ourselves on the random
>>>> behaviour of "long" seems a bit outdated to me. "ssize_t" would be
>>>> better motivated I feel.
>>>>
>>>> Many linear algebra libraries use 32-bit matrix indices by default, but
>>>> can be swapped to 64-bit indices (this holds for many LAPACK
>>>> implementations and most sparse linear algebra). So often there will at
>>>> least be one typedef that is either 32 bits or 64 bits without the
>>>> Cython compiler knowing.
>>>>
>>>> Promoting to a single type "[u]int64" is the only one that removes
>>>> possible combinatorial explosion if you have multiple external typedefs
>>>> that you don't know the size of (although I guess that's rather
>>>> theoretical).
>>>>
>>>> Anyway, runtime table generation is quite fast, see below.
>>>>
>>>>>
>>>>>> "l" must be banished -- but then one might as well do "i4i8i8".
>>>>>>
>>>>>> Designing a signature hash where you select between these at
>>>>>> compile-time is
>>>>>> perhaps doable but does generate a lot of code and makes everything
>>>>>> complicated.
>>>>>
>>>>>
>>>>> It especially gets messy when you're trying to pre-compute tables.
>>>>>
>>>>>> I think we should just start off with hashing at module load
>>>>>> time when sizes are known, and then work with heuristics and/or build
>>>>>> system
>>>>>> integration to improve on that afterwards.
>>>>>
>>>>>
>>>>> Finding 10,000 optimal tables at runtime better be really cheap than
>>>>> for Sage's sake :).
>>>>
>>>>
>>>> The code is highly unpolished as I write this, but it works so here's
>>>> some preliminary benchmarks.
>>>>
>>>> Assuming the 64-bit pre-hashes are already computed, hashing a 64-slot
>>>> table varies between 5 and 10 us (microseconds) depending on the set of
>>>> hashes.
>>>>
>>>> Computing md5's with C code from ulib (not hashlib/OpenSSL) takes
>>>> ~400ns
>>>> per hash, so 26 us for the 64-slot table => it dominates!
>>>>
>>>> The crapwow64 hash takes ~10-20 ns, for ~1 us per 64-slot table.
>>>> Admittedly, that's with hand-written non-portable assembly for the
>>>> crapwow64.
>>>>
>>>> Assuming 10 000 64-slot tables we're looking at something like 0.3-0.4
>>>> seconds for loading Sage using md5, or 0.1 seconds using crapwow64.
>>>>
>>>>
>>>> https://github.com/dagss/pyextensibletype/blob/master/include/perfecthash.h
>>>>
>>>>
>>>> http://www.team5150.com/~andrew/noncryptohashzoo/CrapWow64.html
>>>
>>>
>>> Look: A big advantage of the hash-vtables is that subclasses stay
>>> ABI-compatible with superclasses, and don't need recompilation when
>>> superclasses adds or removes methods.
>>>
>>> => Finding the hash table must happen at run-time in a lot of cases
>>> anyway,
>>> so I feel Robert's chase for a compile-time table building is moot.
>>>
>>> I feel this would also need to trigger automatically heap-allocated
>>> tables
>>> if the statically allocated. Which is good to have in the very few cases
>>> where a perfect table can't be found too.
>>
>> Finding the hash table at runtime should be supported, but the *vast*
>> majority of methods sets is known at compile time. 0.4 seconds is a
>> huge overhead to just add to Sage (yes, it's an exception, but an
>> important one), and though crapwow64 helps I'd rather rely on a known,
>> good standard hash. I need to actually look at Sage to see what the
>> impact would be. Also, most tables would probably have 2 entries in
>> them (e.g. a typed one and an all-object one).
>
> Hopefully 0.4 was a severe overestimate once one actually looks at this.
>
> What's loaded at startup -- is it the pyx files in sage/devel/sage? My
> count (just cloned from github.com/sagemath/sage):
>
> $ find -name '*.pyx' -exec grep 'cdef class' {} \; | wc -l
> 641
>
> And I doubt that *all* of that is loaded at Sage startup, you need to do
> some manual importing for at least some of those classes? So it's
> probably closer to 0.01-0.02 seconds than 0.4 even with md5?
>
> About the *vast* majority of method sets being known: That may be the
> case for old code, but keep in mind that that situation might
> deteriorate. Once we have hash-based vtables, declaring methods of cdef
> classes in pxd files could become optional (and only be there to help
> callers, incl. subclasses, determine the signature). So any method
> that's only used in the superclass and is therefore not declared in the
> pxd file would consistently trigger a run-time build of the table of
> subclasses; the compile-time generated table would be useless then.
>
> (OTOH, as duck-typing becomes the norm, more cdef classes will be
> without superclasses...)
>
>> long int will continue to be an important type as long as it's the
>> default for int literals and Python's "fast" ints (whether in type or
>> implementation), so we can't just move to stdint. I also don't like
>> that the form of the table (and whether certain signatures match)
>> being platform-dependent: the less variance we have from one platform
>> to the next is better.
>
> Perhaps in Sage there's a lot of use of "long" and therefore this would
> make Sage code vary less between platforms.
>
> But for much NumPy-using code you'd typically use int32 or int64, and
> since long is 32 bits on 32-bit Windows and 64 bits on Linux/Mac,
> choosing long sort of maximises inter-platform variation of signatures...
>

Also, promotion can't be used for pointers, buffers, ndarray dtypes...

I don't mind heuristics that work in 99.9% of the cases. Heuristics that 
work in 80% of the cases seem more like a time drain though.

But if there's indeed a problem with Sage load times, and a particular 
set of heuristics allows us to overcome what is otherwise a blocker for 
attaching these tables to cdef classes, then sure.

Dag

>> On an orthogonal note, sizeof(long)-sensitive tables need not be
>> entirely at odds with compile-time table compilation, as most
>> functions will probably have 0 or 1 parameters that are of unknown
>> size, so we could spit out 1 or 2 statically compiled tables and do
>> generate the rest on the fly. I still would rather have fixed
>> Cython-compile time tables though.
>
> Well, I'd "rather have" that as well if it worked every time.
>
> But there's no use designing a feature which works great unless you use
> the fftw_complex type (can be 64 or 128 bits). Or works great unless you
> use 64-bit LAPACK. Or works great unless you have a superclass with a
> partially defined pxd file.
>
> Since one implementation of a concept is simpler than two, then as long
> as run-time generation code must always be there (or at least, be there
> in the common cases x, y, and z), the reasons should be very good for
> adding a compile-time implementation.
>
> Sage taking 0.4 seconds extra would indeed be a very good reason, but I
> don't believe it. So when you can get around to it it'd be great to have
> the actual number of classes (and ideally an estimate for number of
> methods per class).
>
> Dag
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel



More information about the cython-devel mailing list