[Python-Dev] Re: [Bug #121013] Bug in <stringobject>.join(<unicodestring>)

Michael Hudson mwh21@cam.ac.uk
Tue, 28 Nov 2000 10:19:14 +0000 (GMT)


On Mon, 27 Nov 2000, M.-A. Lemburg wrote:

> Michael Hudson wrote:
> 
> > > I still think that the PySequence_Fast_GETITEM() macro should at
> > > least include a fall-back type check which causes some exception in
> > > case the used object was not "fixed" using PySequence_Fast() (which
> > > I only noticed in string_join() just now).
> > 
> > It's hard to see how; you're not going to check each invocation of
> > PySequence_Fast_GETITEM for a NULL return, are you?  It's possible
> > that PySequence_Fast should raise an exception on being passed a
> > string or Unicode object... but probably not.
> 
> Since not using PySequence_Fast() to initialize the protocol,
> I'd suggest doing a Py_FatalError() with some explanatory
> text which gets printed to stderr -- still better than a
> segfault at some later point due to some dangling pointers...

So you'd want PySequence_Fast_GETITEM to look like

#define PySequence_Fast_GETITEM(s,i) \
    (PyList_Check((s)) ? PyList_GET_ITEM((s),(i)) : \
          (PyTuple_Check((s))? PyTuple_GET_ITEM((s),(i)) : \
                  Py_FatalError("muffin!") ))

?  That'd probably be fair - you'd want to check any performance hit, but
I wouldn't be surprised if it was neglible.

> > > Fredrik's PySequence_Fast_* APIs look interesting, BTW. Should be
> > > used more often :-)
> > 
> > Yes.  But they're pretty new, aren't they? 
> 
> Yep. Fredrik added them quite late in the 2.0 release process.
> 
> > I find them a bit
> > unsatisfactory that it's not possible to hoist the type check out of
> > the inner loop.  Still, it seems my PII's branch predictor nails that
> > one... (i.e. changing it so that it didn't check inside the loop made
> > naff-all difference to the running time).
> 
> I think Fredrik speculated on having the compiler optimizers
> taking care of the check... 

gcc 2.95.1 doesn't seem to - even with -O9 you get stuff like

.stabn 68,0,825,.LM312-string_join
.LM312:
        movl PyList_Type@GOT(%ebx),%eax
        cmpl %eax,4(%edi)
        jne .L875
        movl 12(%edi),%eax
        jmp .L898
        .p2align 4,,7
.L875:
        leal 12(%edi),%eax
.L898:
        movl -32(%ebp),%edx
        movl (%eax,%edx,4),%esi

... but I'm no expert in assembly reading.

> hmm, it would probably also help
> to somehow declare PyTypeObject slots "const" -- is this possible
> in a struct definition ?

I have no idea!

Cheers,
M.