Virtualizable Frames getting half removed in trace

This may be a bit of a long post, but I'm trying to provide as much information as possible. I'm attempting to work on a minimalistic, Clojure-friendly VM. The bytecode is quite a bit like Python's, and the program I'm testing looks something like this:

    add-fn (make-code :bytecode [ADD RETURN]
                      :vars []
                      :consts []
                      :locals 0
                      :stacksize 2)

    inner-code (make-code :bytecode [STORE_LOCAL 2
                                     PUSH_CONST, 0,
                                     STORE_LOCAL, 0,
                                     NO_OP, ; 6
                                     PUSH_LOCAL, 0,
                                     PUSH_LOCAL, 2,
                                     EQ,
                                     COND_JMP, 26,
                                     PUSH_LOCAL 0
                                     PUSH_CONST 1
                                     PUSH_CONST 2
                                     INVOKE 2
                                     STORE_LOCAL, 0,
                                     JMP, 6,
                                     NO_OP, ;21
                                     PUSH_LOCAL, 0,
                                     RETURN]
                          :vars []
                          :consts [0 1 add-fn]
                          :stacksize 5
                          :locals 3)

    outer-code (make-code :bytecode [PUSH_CONST, 0,
                                     PUSH_CONST, 1,
                                     INVOKE, 1,
                                     RETURN]
                          :vars []
                          :consts [100000, inner-code]
                          :stacksize 2
                          :locals 0)

This program simply increments a local from 0 to 100000. When I tested this using ADD directly in the inner-code, I ended up with a very tight trace. However, when I added add-fn, the frame for the inner trace ends up getting half created at some points.

The code for the main interpreter is here: https://bitbucket.org/halgari/clojure-vm/src/a95d278c7540cd16efb025f878c3773...

I'm attaching a copy of my latest trace.
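For orientation, the three code objects above behave roughly like the plain Python below. This is only a sketch of the control flow as read from the bytecode listing; the names `add_fn`, `inner` and `outer` are stand-ins for the code objects, and the real VM interprets bytecode rather than calling Python functions.

```python
def add_fn(a, b):
    # add-fn: ADD the two arguments, then RETURN the result.
    return a + b


def inner(limit):
    # inner-code: STORE_LOCAL 2 keeps the argument (the limit),
    # local 0 is the counter, initialised from consts[0] == 0.
    counter = 0
    # EQ + COND_JMP leave the loop once counter == limit.
    while counter != limit:
        # PUSH_LOCAL 0, PUSH_CONST 1 (the constant 1),
        # PUSH_CONST 2 (add-fn), INVOKE 2, STORE_LOCAL 0.
        counter = add_fn(counter, 1)
    return counter


def outer():
    # outer-code: push 100000 and inner-code, INVOKE 1, RETURN.
    return inner(100000)


result = outer()
```

The ideal trace for the loop body is therefore one integer comparison and one integer addition per iteration; anything else in the trace is interpreter overhead that the JIT failed to remove.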
The part I'm not happy with is at the end of the trace:

    debug_merge_point(0, 0, 'INVOKE 2')
    p64 = new_array(1, descr=<ArrayP 8>)
    +1165: p65 = call(ConstClass(ll_mul__GcArray_Ptr_GcStruct_objectLlT_arrayPtr_Signed), p64, 2, descr=<Callr 8 ri EF=4>)
    +1251: guard_no_exception(descr=<Guard0x1006f66b0>) [p0, p65, p60, p6, p8, p16, p18]
    +1299: setarrayitem_gc(p65, 0, p60, descr=<ArrayP 8>)
    +1338: setarrayitem_gc(p65, 1, ConstPtr(ptr37), descr=<ArrayP 8>)
    p67 = new_array(2, descr=<ArrayP 8>)
    +1446: setarrayitem_gc(p67, i30, p60, descr=<ArrayP 8>)
    +1451: p68 = getarrayitem_gc(p65, 1, descr=<ArrayP 8>)
    +1462: setarrayitem_gc(p67, i40, p68, descr=<ArrayP 8>)
    debug_merge_point(1, 1, 'ADD')
    +1474: p69 = getarrayitem_gc(p67, i48, descr=<ArrayP 8>)
    +1479: setarrayitem_gc(p67, i48, ConstPtr(null), descr=<ArrayP 8>)
    +1488: p70 = getarrayitem_gc(p67, i52, descr=<ArrayP 8>)
    +1500: setarrayitem_gc(p67, i52, ConstPtr(null), descr=<ArrayP 8>)
    +1509: i71 = getfield_gc(p69, descr=<FieldS clojure.vm.primitives.Integer.inst_int_value 8>)
    +1513: i72 = getfield_gc(p70, descr=<FieldS clojure.vm.primitives.Integer.inst_int_value 8>)
    +1517: i73 = int_add(i71, i72)
    p74 = new_with_vtable(4297160080)
    +1531: setfield_gc(p74, i73, descr=<FieldS clojure.vm.primitives.Integer.inst_int_value 8>)
    +1535: setarrayitem_gc(p67, i52, p74, descr=<ArrayP 8>)
    debug_merge_point(1, 1, 'RETURN')
    +1540: p75 = getarrayitem_gc(p67, i52, descr=<ArrayP 8>)
    +1545: setarrayitem_gc(p67, i52, ConstPtr(null), descr=<ArrayP 8>)
    debug_merge_point(0, 0, 'STORE_LOCAL 0')
    debug_merge_point(0, 0, 'JMP 6')
    debug_merge_point(0, 0, 'NO_OP')
    +1554: jump(p0, p75, p6, p8, p16, p18, i21, i30, i40, i48, i52, descr=TargetToken(4302274768))

I'm not sure why these allocations aren't getting removed. Any thoughts?

Thanks,
Timothy

Hi Maciej, On 25 February 2014 09:09, Maciej Fijalkowski <fijall@gmail.com> wrote:
ugh that looks really odd, why is p67 not removed escapes my attention
Because we do setarrayitem and getarrayitem on non-constant indexes.
We need tricks to avoid allocating the frame when we *leave* the function. In PyPy it can only be done if we know for sure that nobody can potentially grab a reference to the frame for later (e.g. via exceptions). I'm not sure I remember the latest version of this logic, but there have been several...

A bientôt,
Armin.

So I spent two more hours on this this morning and finally got some good results.

a) I turned on _immutable_ = True on the Code object. Should have done this before.

Then I noticed that the trace contained the creation of the argument list, but that that list was never made. The trace was also making a call out to some C function so that it could do the array = [None] * argc. I couldn't get that to go away even with promoting argc. So I changed pop_values to this instead:

    def pop_values(frame, argc):
        if argc == 0:
            return Arguments([], argc)
        elif argc == 1:
            return Arguments([frame.pop()], argc)
        elif argc == 2:
            b = frame.pop()
            a = frame.pop()
            return Arguments([a, b], argc)
        assert False

Since Clojure only supports up to 20 positional arguments, that'll work just fine. Now the last part of my trace consists of this:

    +266: label(p0, i26, p5, p7, p15, p17, i21, i25, descr=TargetToken(4302275472))
    debug_merge_point(0, 0, 'NO_OP')
    debug_merge_point(0, 0, 'PUSH_LOCAL 0')
    debug_merge_point(0, 0, 'PUSH_LOCAL 2')
    debug_merge_point(0, 0, 'EQ')
    +280: i27 = int_eq(i21, i26)
    guard_false(i27, descr=<Guard0x1006f6480>) [p0, p5, p7, p15, p17, i26]
    debug_merge_point(0, 0, 'COND_JMP 26')
    debug_merge_point(0, 0, 'PUSH_LOCAL 0')
    debug_merge_point(0, 0, 'PUSH_CONST 1')
    debug_merge_point(0, 0, 'PUSH_CONST 2')
    debug_merge_point(0, 0, 'INVOKE 2')
    debug_merge_point(1, 1, 'ADD')
    +289: i28 = int_add(i25, i26)
    debug_merge_point(1, 1, 'RETURN')
    debug_merge_point(0, 0, 'STORE_LOCAL 0')
    debug_merge_point(0, 0, 'JMP 6')
    debug_merge_point(0, 0, 'NO_OP')
    +295: jump(p0, i28, p5, p7, p15, p17, i21, i25, descr=TargetToken(4302275472))

Which is exactly what I was looking for, an add and an eq.

Thanks for the help everyone!

Timothy

On Tue, Feb 25, 2014 at 2:56 AM, Armin Rigo <arigo@tunes.org> wrote:
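The hand-unrolled pop_values can be tried outside the VM with two small stand-ins. In this sketch `Frame` and `Arguments` are hypothetical minimal classes, not the ones from the clojure-vm repository; only the body of `pop_values` follows the message above. It shows that popping `b` before `a` hands the arguments over in the order they were pushed.

```python
class Arguments(object):
    """Hypothetical stand-in for the VM's Arguments class."""

    def __init__(self, values, argc):
        self.values = values
        self.argc = argc


class Frame(object):
    """Hypothetical minimal frame: nothing but an operand stack."""

    def __init__(self):
        self.stack = []

    def push(self, value):
        self.stack.append(value)

    def pop(self):
        return self.stack.pop()


def pop_values(frame, argc):
    # One branch per arity, so each list literal has a fixed length
    # and no "[None] * argc" allocation call is needed.
    if argc == 0:
        return Arguments([], argc)
    elif argc == 1:
        return Arguments([frame.pop()], argc)
    elif argc == 2:
        b = frame.pop()  # pushed last, so popped first
        a = frame.pop()
        return Arguments([a, b], argc)
    assert False, "arity not covered by this sketch"


frame = Frame()
frame.push("first")
frame.push("second")
args = pop_values(frame, 2)
```

Covering all 20 arities this way means one branch per arity, each building a list literal of the matching length.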
-- "One of the main causes of the fall of the Roman Empire was that, lacking zero, they had no way to indicate successful termination of their C programs." (Robert Firth)

Correction to my last email: it should read "but that list was never used".

On Tue, Feb 25, 2014 at 7:06 AM, Timothy Baldridge <tbaldridge@gmail.com> wrote:

Hi Timothy, On 25 February 2014 15:06, Timothy Baldridge <tbaldridge@gmail.com> wrote:
Ah, digging into it more, it seems that "[None] * argc" is not correctly optimised if argc is an unsigned number rather than a regular signed integer, like in your example. Fixed! A bientôt, Armin.

Hey,

The arrays escape because the indexes into the arrays are not constants.

    p67 = new_array(2, descr=<ArrayP 8>)
    +1446: setarrayitem_gc(p67, i30, p60, descr=<ArrayP 8>)

Here, should i30 be always the same value? If yes, you should promote it before the array access. I couldn't figure out what p67 is, whether it's the stack or the arguments array, but if it's the stack, this might mean changing pop as follows:

    def pop(self):
        depth = jit.promote(self.tos) - 1
        val = self.data[depth]
        self.data[depth] = None
        self.tos = depth
        return val

Cheers,
Carl Friedrich

On 25/02/14 06:36, Timothy Baldridge wrote:
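A fuller sketch of the same idea, with the matching push, might look like the following. The field names `data` and `tos` come from the pop shown above; the `Frame` class, its constructor and `push` are assumptions made for illustration. The fallback `jit` stub exists only so the sketch also runs under plain Python, where `promote` has no effect. Promoting is only appropriate if the stack depth really is the same every time a given bytecode position is reached.

```python
try:
    from rpython.rlib import jit
except ImportError:
    # Plain-Python fallback: promote() is just the identity here.
    class jit(object):
        @staticmethod
        def promote(value):
            return value


class Frame(object):
    """Hypothetical frame with a fixed-size operand stack."""

    def __init__(self, stacksize):
        self.data = [None] * stacksize
        self.tos = 0  # index of the next free slot

    def push(self, val):
        # Promote the depth so the JIT sees a constant array index.
        depth = jit.promote(self.tos)
        self.data[depth] = val
        self.tos = depth + 1

    def pop(self):
        depth = jit.promote(self.tos) - 1
        val = self.data[depth]
        self.data[depth] = None  # clear the slot
        self.tos = depth
        return val


frame = Frame(2)
frame.push(1)
frame.push(2)
top = frame.pop()
below = frame.pop()
```

With constant indexes the optimizer can track each array slot individually, which is the property the non-constant `i30`, `i40`, `i48` and `i52` indexes in the original trace were missing.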

Participants (4): Armin Rigo, Carl Friedrich Bolz, Maciej Fijalkowski, Timothy Baldridge