[pypy-dev] asmgcc versus shadowstack

Mon May 30 03:25:50 EDT 2016

Hi Armin,

I don't have very strong opinions, but I'm worried about one particular
thing. GCC tends to change its IR with every release. Wouldn't parsing
it be a nightmare that has to be updated with each new release of gcc?

On Mon, May 30, 2016 at 9:18 AM, Armin Rigo <arigo at tunes.org> wrote:
> Hi all,
>
> Recently, we've received a few more instances of the common bug
> report "cannot find gc roots!".  This is caused by asmgcc somehow
> failing to parse the ".s" files produced by gcc on Linux.
>
> I'm investigating what can be done to improve the situation of asmgcc
> in a more definitive way.  There are basically two solutions:
>
>
> 1) we improve shadowstack.  This is the alternative to asmgcc, which
> is used on any non-Linux platform already.  So far it is around 10%
> slower than asmgcc.
>
> 2) we improve asmgcc by finding some better way than parsing assembler files.
>
>
> I have been working for the past month in the branch
> "shadowstack-perf-2".  It gives a major improvement in the placement
> of the pushes and pops of GC roots on the shadow stack.  I think it's
> worth merging that branch in any case.  On x86, it gives roughly a
> 3-4% speed improvement; I'd guess that on ARM it is slightly more.
> (I'm comparing the performance outside JITted machine code; the
> JITted machine code we produce is more similar between the two.)
>
> The problem is that asmgcc used to be ~10% better.  IMHO, 3-4% is not
> quite enough to be happy and kill asmgcc.  Improving beyond these 3-4%
> seems to require some new ideas.
>
>
> So I'm also thinking about ways to fix asmgcc more generally, this
> time focusing on Linux only; asmgcc contains old code that tries to
> parse MSVC output, and I bet we tried with clang at some point, but
> these attempts both failed.  So let's focus on Linux and gcc only.
>
> Asmgcc does two things with the parsed assembler: it computes the
> stack size at every point, and it tracks some marked variables
> backward until the previous "call" instruction.
>
> I think we can assume that the version of gcc is not older than, say,
> the one on tannit32 (Ubuntu 12.04), which is gcc 4.6.  At least from
> that version, both on x86-32 and x86-64, gcc will emit "CFI
> directives" (https://sourceware.org/binutils/docs/as/CFI-directives.html).
> These are a saner way to get the information about the current stack
> size.
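
To make the CFI idea concrete, here is a minimal sketch (not asmgcc's
actual code) that tracks the CFA offset through a ".s" file.  It assumes
code built without a frame pointer, so that the CFA stays relative to the
stack pointer, and an entry offset of 8 bytes as on x86-64 (4 on x86-32).
The function name cfa_offsets and the sample function are made up for
illustration.

```python
import re

# Matches lines such as ".cfi_def_cfa_offset 16".
CFI_RE = re.compile(r"^\s*\.cfi_(\w+)\s*(.*)$")

def cfa_offsets(asm_lines, initial_offset=8):
    """Return (line, cfa_offset) for every non-CFI, non-blank line found
    between .cfi_startproc and .cfi_endproc (labels included).  The
    offset is the one in effect just before that line executes."""
    offset = None          # None means "not inside a function"
    saved = []             # stack for .cfi_remember_state
    result = []
    for line in asm_lines:
        m = CFI_RE.match(line)
        if m is None:
            if offset is not None and line.strip():
                result.append((line.strip(), offset))
            continue
        name, args = m.group(1), m.group(2).strip()
        if name == "startproc":
            offset, saved = initial_offset, []
        elif name == "endproc":
            offset = None
        elif name == "def_cfa_offset":
            offset = int(args)
        elif name == "adjust_cfa_offset":
            offset += int(args)
        elif name == "def_cfa":
            # ".cfi_def_cfa REG, OFFSET": only the offset is kept here
            offset = int(args.split(",")[1])
        elif name == "remember_state":
            saved.append(offset)
        elif name == "restore_state":
            offset = saved.pop()
        # other directives (.cfi_offset, ...) do not change the offset
    return result

SAMPLE = """\
f:
    .cfi_startproc
    pushq %rbx
    .cfi_def_cfa_offset 16
    .cfi_offset 3, -16
    subq $32, %rsp
    .cfi_def_cfa_offset 48
    call g
    addq $32, %rsp
    .cfi_def_cfa_offset 16
    popq %rbx
    .cfi_def_cfa_offset 8
    ret
    .cfi_endproc
""".splitlines()

for insn, offset in cfa_offsets(SAMPLE):
    print("%3d  %s" % (offset, insn))
```

Since gcc emits each directive after the instruction that moves the stack
pointer, the offset paired with an instruction is the one valid before it
runs; at "call g" above it is 48.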
>
> About the backward tracking, we need to have a complete understanding
> of all instructions, even if e.g. for any xmm instruction we just say
> "can't handle GC pointers".  The backward tracking itself is often
> foiled because the assembler is lacking a way to know clearly "this
> call never returns" (e.g. calls to abort(), or to some RPython helper
> that prints stuff and aborts).  In other words, the control flow is
> sometimes hard to get correctly, because a "call" generally returns,
> but not always.  Such mistakes can produce bogus results (including
> "cannot find gc roots!").
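
A toy illustration of the control-flow problem, under heavy assumptions:
instructions are simplified "mnemonic operand" strings, only direct jumps
to labels are handled, and the NORETURN set is a hypothetical hand-written
list (the assembler itself gives no such information).  It computes the
instructions from which control can reach a given point, which is what a
backward tracker has to follow.

```python
NORETURN = {"abort", "exit"}   # hypothetical list of non-returning callees

def falls_through(insn, noreturn=NORETURN):
    """True if control can continue to the instruction after 'insn'."""
    op, _, arg = insn.partition(" ")
    if op in ("jmp", "ret"):
        return False
    if op == "call" and arg.strip() in noreturn:
        return False
    return True

def predecessors(insns, index, noreturn=NORETURN):
    """Indices of the instructions that can transfer control to
    insns[index]: the previous one if it falls through, plus any jump
    that targets the label at insns[index]."""
    preds = []
    if index > 0 and falls_through(insns[index - 1], noreturn):
        preds.append(index - 1)
    if insns[index].endswith(":"):
        label = insns[index][:-1]
        for i, insn in enumerate(insns):
            op, _, arg = insn.partition(" ")
            if op.startswith("j") and arg.strip() == label:
                preds.append(i)
    return preds

INSNS = [
    "testq %rax, %rax",   # 0
    "jne .L2",            # 1
    "call abort",         # 2: never returns
    ".L2:",               # 3
    "movq %rbx, %rdi",    # 4
    "call g",             # 5
]

print(predecessors(INSNS, 3))                  # abort known as noreturn
print(predecessors(INSNS, 3, noreturn=set()))  # abort assumed to return
```

With "abort" in the set, the label .L2 is only reachable from the "jne".
Without it, the tracker also walks back through "call abort", a path that
cannot happen at run time, and may draw wrong conclusions from it.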
>
> What can we do about that?  Maybe we can compile with "-S
> -fdump-final-insns".  This dumps a gcc-specific summary of the RTL,
> the final intermediate representation, which looks like it is in
> one-to-one correspondence with the actual assembly.  It would be a
> better input for the backward-tracker, because we don't have to handle
> tons of instructions with unknown effects, and because it contains
> explicit points at which control flow cannot pass.  On the other hand,
> we'd need to parse both the .s and this dump in parallel, matching
> them as we go along.  But I still think it would be better than now.
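
For reference, a small sketch of how such a compilation could be driven
from Python so that the assembly and the RTL dump are produced together.
It assumes the intended option is "-S" (emit assembly) and uses the
"-fdump-final-insns=FILE" form of the flag; the helper name gcc_command
and the file names are made up for illustration.

```python
def gcc_command(c_path, asm_path, dump_path, extra_flags=()):
    """Build a gcc command line that writes the assembly to asm_path
    and the final RTL dump to dump_path."""
    return (["gcc", "-S", "-fdump-final-insns=" + dump_path]
            + list(extra_flags)
            + ["-o", asm_path, c_path])

cmd = gcc_command("module.c", "module.s", "module.rtl", ["-O2"])
print(" ".join(cmd))
# To actually run it (needs gcc installed):
#     import subprocess
#     subprocess.check_call(cmd)
```

The two output files would then be read side by side, with the .s file
supplying the CFI information and the dump supplying the control flow.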
>
> Of course the best would be to get rid of asmgcc completely...
>
> This mail is meant to be a dump of my current mind's state :-)
>
>
> A bientôt,
>
> Armin.