[C++-sig] boost::python with virtual inheritance and g++ c++0x/11 (testcase attached)

Niall Douglas s_sourceforge at nedprod.com
Tue May 22 16:43:06 CEST 2012


On 21 May 2012 at 18:43, Jonas Wielicki wrote:

> On 21.05.2012 17:55, Niall Douglas wrote:
> > 1. Does the bug occur in non-optimised as well as optimised builds?
> It does.

Joy.

> > 2. Does the bug occur when C++11 is turned off?
> It does not.

Not so much joy. It could, technically speaking, be a 
misinterpretation of C++03 by either BPL or GCC. Still, at least it 
narrows things down.

> > 5. Are you using precompiled headers? If so, does the bug occur when 
> > you turn those off?
> I think I am not. I would know if I did, wouldn't I? At least I can
> browse through the header sources at /usr/include/boost/..

Precompiled headers on GCC are basically a dump of state just after 
processing the headers. As a result, the file is huge. That might 
help you find out if they're on. I believe bjam defaults them to off.

> > 6. Can you spot where in the assembly it's making a pointer 
> > dereferencing mistake?
> That is in the template, but the pointer value changes inside code which
> is compiled in the boost library.

Are you saying that the *offset* changes inside the code? To give a 
made-up example: class A has its vtable at this[-8]. In the headers, 
a->foo() indexes this[-8] by 0x16. But inside the compiland, a->foo() 
indexes this[-8] by 0x20, which is wrong.

> Within the function defined around inheritance.cpp:392, the value of the
> object pointer (I think it's called p) changes. I was yet unable to find
> the specific point where the value is changed, because a lot of
> subfunctions get called in there and, to be honest, I'm not that
> familiar with gcc yet. Also it seems as maybe the wrong value is only
> passed, while it is still intact on stack (gcc at least shows me
> differing values for the two stackframes), but that might be due to
> debug data or gcc magic?

Yeah, the GCC ABI is a bit fusty. It got a lot better from 3.2 
onwards, though. You might find 
http://sourcery.mentor.com/public/cxx-abi/cxx-vtable-ex.html useful 
as a reference for the C++03 ABI.
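
As a rough illustration of why a pointer value can legitimately differ 
between two stack frames under virtual inheritance, here is a minimal 
sketch. The class names (Base, Left, Right, Derived) and the helper 
functions are hypothetical and are not taken from the attached testcase 
or from inheritance.cpp; the offsets printed depend on the compiler and 
ABI. Converting a Derived* to a base pointer may adjust the address, so 
a differing value is not in itself proof of corruption.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>

// Hypothetical diamond hierarchy, loosely modelled on the kind of
// virtual-inheritance testcase discussed in this thread.
struct Base {
    int base_data;
    Base() : base_data(0) {}
    virtual ~Base() {}
    virtual int foo() const { return 1; }
};

struct Left : virtual Base {
    int left_data;
    Left() : left_data(0) {}
};

struct Right : virtual Base {
    int right_data;
    Right() : right_data(0) {}
};

struct Derived : Left, Right {
    virtual int foo() const { return 42; }
};

// Distance in bytes from the start of the complete object to a subobject.
std::ptrdiff_t byte_offset(const void* subobject, const void* complete) {
    return static_cast<const char*>(subobject) -
           static_cast<const char*>(complete);
}

// Recover the address of the most-derived object from a Base pointer.
// dynamic_cast to (const) void* is specified to yield exactly that.
const void* most_derived(const Base* b) {
    return dynamic_cast<const void*>(b);
}

// Print the addresses a debugger might show in different stack frames.
void report(const Derived& d) {
    const Base* as_base = &d;    // implicit conversion may adjust the pointer
    const Right* as_right = &d;  // likewise for a non-primary base
    std::cout << "Derived at " << static_cast<const void*>(&d) << "\n"
              << "Right   at +" << byte_offset(as_right, &d) << "\n"
              << "Base    at +" << byte_offset(as_base, &d) << "\n";
    // Whatever the adjustment was, it must lead back to the same object.
    assert(most_derived(as_base) == static_cast<const void*>(&d));
}
```

Calling report() on a Derived object in both a C++03 and a C++11 build 
and comparing the printed offsets is one way to check whether the two 
modes agree on the layout. If the assertion fires, the adjustment itself 
is suspect.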

Of course, almost certainly the bug you're seeing is related to the 
ABI changes they're making for C++11, of which there are quite a few. 
If I had to take a guess, your problems might have something to do 
with the bug fixes in decltype support, or with something like this 
changelog item in 4.7: "The representation of C++ virtual thunks and 
aliases (both implicit and defined via the alias attribute) has been 
re-engineered. Aliases no longer pose optimization barriers and calls 
to an alias can be inlined and otherwise optimized."

BTW - can I just clarify you ARE compiling the whole of BPL using 
C++11 throughout? Linking C++11 to C++03 is *supposed* to work (but 
not the other way round), but I can see nests of vipers in it.
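
One way to confirm which mode each piece was actually built in is to 
have every translation unit report the language standard it saw. The 
sketch below is only an illustration: the function names are 
hypothetical, and the check on __GXX_EXPERIMENTAL_CXX0X__ rests on the 
assumption that GCC releases before 4.7 defined that macro under 
-std=c++0x while leaving __cplusplus at a non-standard value.

```cpp
#include <cassert>
#include <iostream>

// Value of __cplusplus as seen by this translation unit.
long compiled_standard() {
    return __cplusplus;
}

// True when this translation unit was built in C++11 mode or later.
// __GXX_EXPERIMENTAL_CXX0X__ is checked too, on the assumption that
// older GCC releases set it under -std=c++0x without updating __cplusplus.
bool built_as_cxx11() {
#if defined(__GXX_EXPERIMENTAL_CXX0X__)
    return true;
#else
    return __cplusplus >= 201103L;
#endif
}

// Print one line describing how the named unit was compiled.
void print_build_mode(const char* unit_name) {
    assert(unit_name != 0);
    std::cout << unit_name << ": __cplusplus=" << compiled_standard()
              << (built_as_cxx11() ? " (C++11 or later)" : " (C++03)")
              << "\n";
}
```

Compiling a copy of this into the extension module and another into 
the library it links against, then comparing the two lines of output, 
would show whether the build really is C++11 throughout.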

> > If you're *really* unlucky, the bug is that different assembler is 
> > being generated in different compilands and the fact you're seeing a 
> > problem is due to sheer chance because of the stochastic choice the 
> > linker made :)
> Actually, the problem is deterministic. I compiled the binary many times
> now and with gcc 4.7 I always get the segfault, at the same instruction,
> with the same surroundings (changed pointer value from one stack frame
> to the other).

Given the information so far, you have an excellent chance of getting 
it fixed.

Another thought - I once persuaded the GCC devs to accept a bug when 
I demonstrated that ICC (Intel's compiler) got it right. ICC is free 
on Linux, so that might be worth a shot too.

Niall

-- 
Technology & Consulting Services - ned Productions Limited.
http://www.nedproductions.biz/. VAT reg: IE 9708311Q.
Work Portfolio: http://careers.stackoverflow.com/nialldouglas/




